We are looking for a talented AI Engineer to join our team and work on cutting-edge LLM and AI systems deployed on-premise. If you are passionate about building real AI products and deploying scalable inference systems, we'd love to hear from you.

What you'll work on
- Building and deploying LLM / VLM systems
- Developing RAG architectures and AI chatbots
- Deploying models using FastAPI and vLLM
- Working with embedding models and vector databases
- Prompt engineering and model fine-tuning (LoRA, QLoRA, PEFT)
- Designing data pipelines for training and inference
- Performance optimization, monitoring, and cost control
- AI system evaluation and metrics
- AI/LLM Ops and production deployment
- Containerization using Docker and working with cloud environments

Responsibilities
- Deploy scalable AI inference pipelines
- Improve model performance, latency, and cost efficiency
- Integrate AI systems with backend and frontend applications
- Build and maintain production-grade AI systems

Requirements
- Strong Python programming skills
- Experience with LLMs, RAG architectures, and vector databases
- Experience deploying production AI systems
- Familiarity with FastAPI, vLLM, Docker, and cloud environments
- Experience with model fine-tuning and prompt engineering