Job Title: GenAI Architect
Location: North Quincy, MA (Onsite)
Duration: 6 Months
Employment Type: Contract
Job Summary
We are seeking an experienced GenAI Architect to lead the design, development, and deployment of enterprise-grade Generative AI solutions. The ideal candidate will have deep expertise in modern AI/ML frameworks, foundation models, LLM architectures, vector databases, and cloud-native ecosystems. This role involves close collaboration with data scientists, engineers, business stakeholders, and security teams to architect scalable and compliant AI systems tailored to business needs.
Key Responsibilities
· Architect end-to-end Generative AI solutions, including model selection, fine-tuning, prompt engineering, retrieval-augmented generation (RAG), and deployment strategies.
· Develop scalable AI application architectures leveraging cloud platforms (AWS/Azure/GCP), containerization, orchestration, and MLOps pipelines.
· Evaluate, customize, and integrate foundation models (LLMs, Vision Models, Multimodal Models) into enterprise workflows.
· Lead the design of vector search, embeddings, and knowledge-retrieval systems for enterprise data.
· Partner with security and compliance teams to ensure AI systems meet data governance, risk, and regulatory requirements.
· Provide hands-on technical mentorship to engineering and data teams implementing AI features and pipelines.
· Define best practices for model monitoring, drift detection, data quality management, and model lifecycle management.
· Collaborate with product and business teams to translate use-case requirements into robust AI solution architectures.
· Conduct POCs, feasibility studies, and performance benchmarking for new AI technologies and models.
Required Qualifications
· Bachelor’s or Master’s degree in Computer Science, AI, Data Science, or a related field.
· 8+ years of experience in software or data engineering, including at least 3 years focused on AI/ML, with hands-on experience in GenAI/LLMs.
· Deep knowledge of LLM and deep learning frameworks (Hugging Face, LangChain, LlamaIndex, TensorFlow, PyTorch).
· Proven experience with RAG systems, embeddings, and vector databases (Pinecone, FAISS, Weaviate, Milvus).
· Strong understanding of cloud AI services (Amazon SageMaker, Azure OpenAI, Google Vertex AI) and MLOps tools (MLflow, Kubeflow).
· Experience fine-tuning LLMs and applying prompt engineering techniques.
· Expertise in Python and familiarity with API development, microservices, and serverless architectures.
· Knowledge of data privacy, compliance, and responsible AI principles.
Preferred Qualifications
· Experience implementing multimodal AI (text, vision, audio).
· Background in financial services or regulated enterprise environments.
· Familiarity with vector indexing optimization, distributed model inference, or model compression techniques.
· Prior experience leading technical architecture for large-scale AI deployments.