Accelerate your production ML journey with Refract


Production ML (Machine Learning) is more engineering than machine learning. Building a prototype has become simple thanks to open-source projects like scikit-learn, TensorFlow, and Keras. Operationalizing that model, so that its insights can feed day-to-day business decisions, is the hard part, and it needs more engineering knowledge than data science knowledge.

That’s where Fosfor Refract comes in. It helps users set up standard processes and tools, avoid the common engineering pitfalls around machine learning, and follow a guided approach to building a best-of-breed production ML engine.

Here are some of the common challenges we face on a regular basis and how Refract solves them:

  • Decoding Coding: We have heard many developers say, “This code was working on my laptop, and I’m not sure why it’s failing in production.” The primary reason is that in a typical ML project, user code contributes about 1% of the overall package, and the other 99% comes from third-party packages and OS-level dependencies. So it is very important that everything you used during development is preserved and shipped to production. Refract does this by giving data scientists a centralized, configurable environment in which to write their model code. When you operationalize that code, all the dependencies are shipped along with it, which makes the journey seamless.
  • Overcoming Repetitive Patterns: In a large organization, similar projects are common, and so is duplicated effort. With a siloed development approach, this is unavoidable. Refract offers a collaborative environment across the enterprise to avoid such scenarios. Any data scientist can browse the entire catalog of models and use cases created earlier, then adopt one in their current project or keep enhancing it to suit their needs rather than starting from scratch.
  • Leaving behind Legacy Models: Being able to reproduce a model is critical for debugging and for compliance. It is also difficult, because in an ML project code is not the only thing that changes over time: data and model parameters change the model’s behavior too. Refract simplifies this orchestration by storing the hyperparameters, model code, and data used for every model run. You can always look back at that record and reproduce the complete run with minimal effort.
  • Expanding Training Capabilities: Setting up a workspace with a large training capacity is time-consuming and requires the involvement of multiple teams. Refract automates infrastructure procurement, be it CPU or GPU, and gives its users flexible coding environments within the procured infrastructure. This boosts the productivity of data scientists and spares them the learning curve of infrastructure setup.
  • Repetitive tasks for different use cases: “Don’t repeat yourself” is a powerful concept, and applying it to ML development saves a lot of unnecessary effort. Refract abstracts many complex operations into reusable SDKs, which help data scientists fetch data from any data source, do error analysis, or even deploy their models with a single line of code rather than reinventing the same thing again and again.
  • Testing with 1000 Concurrent Users: Scaling deployed models to serve many concurrent users is a project in itself. Refract ships prebuilt, tested templates for deploying models, and the platform takes care of the scaling mechanisms underneath. Users can breathe a sigh of relief and concentrate on creating more accurate models while the system handles their engineering needs around model serving.
  • Model Deployed? But There’s More: Many of the teams we talk to don’t have standard model monitoring in place, and over time that leads to inaccurate models in production as business needs change. It’s very important to keep monitoring your production models and take the necessary actions if any deterioration is observed. Refract offers a complete observability framework, which can help you detect changes in incoming data, understand changes in prediction behavior, monitor model accuracy over time, and raise appropriate alerts so you can make a proactive decision to retrain your model. It also helps you manage multiple versions of a model and compare them.
  • The Road Ahead: Best practices ingrained into every step and an intuitive user experience make Refract a comprehensive platform for data science teams. It reduces time-to-market through automation, increases innovation through collaboration, and builds trust in your models with the right governance structure in place.
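
To make the reproducibility point above concrete, here is a minimal sketch of what a per-run record might capture: a fingerprint of the model code, a fingerprint of the training data, the hyperparameters, and the versions of the runtime and key packages. This is not Refract's SDK or API. The names `build_run_manifest` and `same_run` are hypothetical, and the sketch uses only the Python standard library.

```python
import hashlib
import json
import platform
import sys
from importlib import metadata


def sha256_of(data: bytes) -> str:
    """Return a hex SHA-256 fingerprint of raw bytes."""
    return hashlib.sha256(data).hexdigest()


def build_run_manifest(model_code, training_data, hyperparameters, packages=()):
    """Collect what is needed to identify one training run (hypothetical helper).

    model_code: source text of the training script
    training_data: raw bytes of the dataset snapshot
    hyperparameters: dict of parameter name -> value
    packages: names of installed distributions whose versions should be pinned
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the absence instead of failing the whole run.
            versions[name] = None
    return {
        "code_sha256": sha256_of(model_code.encode("utf-8")),
        "data_sha256": sha256_of(training_data),
        "hyperparameters": dict(sorted(hyperparameters.items())),
        "python_version": platform.python_version(),
        "platform": sys.platform,
        "package_versions": versions,
    }


def same_run(a, b):
    """Two manifests match only if code, data, parameters, and environment all match."""
    return json.dumps(a, sort_keys=True) == json.dumps(b, sort_keys=True)


if __name__ == "__main__":
    manifest = build_run_manifest(
        model_code="model.fit(X, y)",
        training_data=b"feature,label\n1.0,0\n2.0,1\n",
        hyperparameters={"learning_rate": 0.1, "max_depth": 3},
        packages=["pip"],
    )
    print(json.dumps(manifest, indent=2))
```

Storing one such manifest per run, next to the model artifact, is enough to tell whether a change in model behavior came from the code, the data, the parameters, or the environment.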

Author

Shivanand Pawar

Product Head for Refract

Shivanand Pawar is Product Head for Refract, a proprietary AI platform by Fosfor. With 10+ years of experience, he is a hands-on expert in data science, big data, Kubernetes, and application development. Shiva actively consults with clients across various domains on adopting big data and machine learning and on operationalizing AI/ML at scale. He brings in immense knowledge around the challenges.
