Job Type: Contract
Job Category: IT

Job Description

Hiring: Unstructured.io Developer
Location: Remote (Boston, MA)
Contract: 6–12 Months (Extendable)

Job Summary: We are seeking an experienced Unstructured.io Developer to work on enterprise-grade data ingestion and document processing solutions. The ideal candidate will have strong hands-on experience with the Unstructured.io framework, data transformation pipelines, and integration with LLM, vector database, and search platforms. In this role, you will develop and optimize workflows for parsing, cleaning, and indexing complex enterprise documents.

Key Responsibilities

  • Develop and enhance data processing pipelines using Unstructured.io to convert unstructured data (PDF, DOCX, HTML, emails, scanned documents) into structured formats.
  • Integrate extracted data with Vector Databases or Search Indexing workflows for LLM/RAG applications.
  • Optimize parsing performance, accuracy, and consistency across various document formats.
  • Work with Python-based microservices, APIs, and orchestration frameworks.
  • Collaborate with Data Engineering, ML, and Product teams to design scalable ingestion architectures.
  • Implement best practices for scalable, reusable pipeline components.
  • Monitor, debug, and resolve pipeline issues across staging and production environments.

Required Skills & Experience

  • Overall IT Experience: 8+ Years
  • 3+ years of hands-on experience implementing Unstructured.io in production environments.
  • Strong experience with Python, including parsing, data transformation, and API development.
  • Experience building RAG (Retrieval-Augmented Generation) or Document AI workflows.
  • Hands-on experience with vector databases (Pinecone, Weaviate, Chroma, FAISS, Milvus, etc.).
  • Familiarity with Cloud Platforms (AWS preferred).
  • Experience with Docker, Git, and CI/CD pipelines.

Nice to Have

  • Experience with frameworks such as LangChain or LlamaIndex.
  • Knowledge of NLP, embeddings, and tokenization.
  • Experience integrating with LLM providers (OpenAI, Anthropic, Azure OpenAI, etc.).
  • Familiarity with document OCR tools (Tesseract, Azure Form Recognizer, AWS Textract).

Required Skills
Cloud Developer SQL Application Developer
