Freelancing Opportunity
Hours Required: 5 hours daily
Job Title: Python AI/ML Data Engineer
About the Role
We are looking for a Python AI/ML Data Engineer to build intelligent automation solutions for Contract Lifecycle Management (CLM). This role focuses on applying unsupervised machine learning, NLP, and fuzzy matching techniques to process, classify, and match large volumes of unstructured contract data. You will work on scalable data pipelines that reduce manual effort and improve accuracy through human-in-the-loop AI systems.
Key Responsibilities
AI & Machine Learning
Design and implement unsupervised ML models using scikit-learn to cluster and profile contract templates.
Perform similarity analysis using techniques such as TF-IDF, cosine similarity, or distance-based models.
Develop fuzzy matching logic (RapidFuzz / fuzzywuzzy) to reconcile extracted contract attributes with historical back-office data.
Continuously refine models using feedback from manual review and stewardship teams.
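For candidates gauging the scope, the fuzzy matching responsibility above can be sketched roughly as follows. This is only an illustration, not the project's actual logic: it uses the standard library's difflib.SequenceMatcher as a stand-in for RapidFuzz / fuzzywuzzy, and the function names, sample vendor names, and 0.8 threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so cosmetic differences do not hurt the score."""
    return " ".join(text.lower().split())


def best_match(value, candidates, threshold=0.8):
    """Return (candidate, score) for the closest candidate.

    If no candidate reaches the (hypothetical) threshold, return (None, score)
    so the record can be routed to manual review instead of auto-matched.
    """
    best, best_score = None, 0.0
    for candidate in candidates:
        score = SequenceMatcher(None, normalize(value), normalize(candidate)).ratio()
        if score > best_score:
            best, best_score = candidate, score
    if best_score < threshold:
        return None, best_score
    return best, best_score


# Hypothetical back-office values and extracted contract attributes.
back_office = ["ACME Corp", "Globex Inc", "Initech LLC"]
matched, matched_score = best_match("Acme Corp.", back_office)
unmatched, unmatched_score = best_match("Zzz Holdings", back_office)
```

Returning None below the threshold is one way to feed the human-in-the-loop review described in the role: low-confidence pairs go to stewards rather than being matched silently.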
Python Automation & Data Engineering
Build Python scripts to parse, transform, and validate complex nested JSON contract data.
Extract contract attributes (products, pricing, clauses, renewal terms, dates) using Regex and NLP techniques.
Create batch-processing pipelines to handle 20–50+ products per run with deduplication and data integrity checks.
Flatten unstructured data into structured, analysis-ready formats (CSV / Excel / relational tables).
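As a rough illustration of the parsing and flattening work listed above, the sketch below turns one nested JSON record into a single flat row and writes it as CSV. It uses only the standard library; the field names, the dotted-key convention, and the sample contract are assumptions made for the example, not the real contract schema.

```python
import csv
import io
import json


def flatten(obj, parent_key="", sep="."):
    """Recursively flatten nested dicts and lists into a single-level dict.

    Nested keys are joined with `sep`; list items use their index as the key.
    """
    items = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
            items.update(flatten(value, new_key, sep))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            new_key = f"{parent_key}{sep}{index}" if parent_key else str(index)
            items.update(flatten(value, new_key, sep))
    else:
        items[parent_key] = obj
    return items


# Hypothetical contract payload; real data would be far larger and messier.
raw = (
    '{"contract_id": "C-001",'
    ' "terms": {"renewal": "auto", "start": "2024-01-01"},'
    ' "products": [{"name": "Widget", "price": 10.5}]}'
)
record = flatten(json.loads(raw))

# Write the flat record as one CSV row (in memory here, for illustration).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(record))
writer.writeheader()
writer.writerow(record)
csv_text = buffer.getvalue()
```

In practice a pandas-based approach would likely replace the hand-written helper, but the shape of the task (nested input, one flat row per contract or product) stays the same.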
Data Validation & Quality Control
Automatically flag data gaps, inconsistencies, and anomaly cases for manual review.
Generate validation reports and exception logs for business and compliance teams.
Reporting & Visualization
Prepare structured outputs for dashboards tracking coverage %, match confidence, and model accuracy.
Support reporting integrations with Power BI / Google Data Studio / Looker.
Required Skills & Qualifications
Strong proficiency in Python.
Hands-on experience with scikit-learn for clustering and similarity modeling.
Advanced usage of pandas, numpy, and Python data pipelines.
Strong knowledge of Regex and text processing.
Experience parsing and transforming JSON and semi-structured data.
Practical experience implementing fuzzy string matching in real-world use cases.
Preferred Qualifications
Experience with NLP libraries such as spaCy, NLTK, or similar.
Background in Contract Lifecycle Management (CLM), legal documents, or compliance data.
Experience with feature engineering for unstructured text.
Familiarity with human-in-the-loop or semi-automated AI systems.
Tech Stack
Programming: Python 3.x
ML & Data: scikit-learn, pandas, numpy
Text & Matching: Regex, RapidFuzz / fuzzywuzzy
Data Formats: JSON, CSV, Excel
Tools: Git, Jupyter Notebook, Power BI / Looker
Job Types: Part-time, Freelance
Contract length: 12 months
Pay: ₹40,000.00 - ₹50,000.00 per month
Benefits:
Work from home
Experience:
Total work experience: 6 years (Required)
Shift availability:
Night Shift (Required)
Overnight Shift (Required)
Work Location: Remote