Recap of the First Data Science Infrastructure Meetup

Our first data science infrastructure meetup was held on 23.1.2023. The event is part of a series of meetups aimed at data professionals and enthusiasts. The first event focused on the tools and infrastructure that enable productive data science and analytics projects.

The response to the first event was very positive. The next event is already scheduled for Tuesday 21.3. Book your spot on Eventbrite!

Our motivation for starting these meetups is to keep up with a fast-moving field. Data science tools, methods and infrastructure are enablers for all of us professionals, and technologies in data science, machine learning and AI are moving very fast. We think that together, by sharing ideas and experiences and by learning from each other, we can have the largest possible impact in this field globally. We believe that a community where things can be discussed openly benefits everyone! Finland has a lot of potential in these fields.

Learnings from the event

We started the discussion with a round of introductions. With 30+ participants onsite and online, individual intros were not possible, so we asked for a show of hands instead. That way we were able to recognize researchers as well as senior and junior professionals. It was great to see the variety of participants. We had people joining from companies that are interested in using data and AI as well as from companies that offer these services. Large corporations, startups, growth companies, the technology sector and established industries were all represented. There were also a bunch of machine learning students, keenly listening.

Event about to start

After the round of intros, our host Mikko led us into the topic by introducing two recent reports on the state of AI.

Both of these reports highlight the need for companies to purchase, hire and train talented data engineers, analysts and scientists. Small and medium enterprises in particular will need assistance to make the most of AI. Companies also find AI projects time-consuming, and data quality varies a lot. We addressed both of these issues in our meetup's agenda!

The Nordic State of AI 2022 report discussed in detail the challenges that companies face when scaling AI to production use. To summarize, some of the obstacles are:

  • Lack of shared practices around data

  • Lack of scalable infrastructure

  • Insufficient investments

  • Lack of data

  • Lack of talent

These could be addressed by proper data infrastructure design, covering everything from data collection to how the data is served. So if your data projects lack results, it’s possible your infrastructure does not lend itself to productive data utilization!

After Mikko’s intro we had the privilege of hearing from Ville Tuulos of Outerbounds! Ville’s presentation was about empowering data scientists with modern tools that make working with data more efficient and effective. First, Ville explained how the lifecycle of a data science project depends on both humans and machines. Then he introduced us to the infrastructure that enables data scientists to work efficiently. Since the technology stack for data projects is often tall, a key component of successful projects is effective orchestration of the right tools.

Finally, Ville showed how to implement operational data science environments that are part of the enterprise infrastructure rather than a separate island. Data science tools should become part of the overall operational architecture. This was our key takeaway.

If this interests you, check out Metaflow, an open-source data science tool originally created at Netflix and now developed by Outerbounds. You may also read more in our blog.

Ville Tuulos discussing data science projects and Metaflow

Sami Dahlman from Vaisto Solutions showed us some of their data tools and methods. The focus was using synthetic data to boost their machine learning projects. This really helps if your data quantity or quality is lacking. You may, for example, use synthetic data to train an initial model. Then you’d go on to fine-tune or use few-shot learning methods to improve your model accuracy. We got to see how Vaisto has used these methods to deliver machine vision solutions. Sami’s demos were impressive, and recent advances in generative AI will make their approach even more powerful!
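To make the "pretrain on synthetic data, then fine-tune on real data" idea concrete, here is a minimal toy sketch. It is not Vaisto's method: their work is in machine vision, while this example uses a one-variable linear model in plain Python. The "real" process, the imperfect synthetic generator and all function names are assumptions made up for illustration.

```python
import random


def make_data(n, slope, intercept, noise, rng):
    """Generate n (x, y) pairs from y = slope * x + intercept + Gaussian noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = slope * x + intercept + rng.gauss(0.0, noise)
        data.append((x, y))
    return data


def train(data, w=0.0, b=0.0, lr=0.1, epochs=200):
    """Fit y = w * x + b with full-batch gradient descent, starting from (w, b)."""
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def mse(data, w, b):
    """Mean squared error of the model y = w * x + b on the given data."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


rng = random.Random(0)

# Hypothetical "real" process (assumption): slope 2.0, intercept 0.5.
real_train = make_data(10, 2.0, 0.5, 0.05, rng)   # scarce real data
real_test = make_data(200, 2.0, 0.5, 0.05, rng)   # held-out real data

# Hypothetical synthetic generator (assumption): close to reality, but not exact.
synthetic = make_data(500, 1.8, 0.3, 0.05, rng)   # plentiful synthetic data

# Step 1: train an initial model on synthetic data only.
w0, b0 = train(synthetic)

# Step 2: fine-tune from that starting point with a few real samples,
# using a smaller learning rate and far fewer epochs.
w1, b1 = train(real_train, w=w0, b=b0, lr=0.05, epochs=20)

print("test MSE after synthetic pretraining:", mse(real_test, w0, b0))
print("test MSE after fine-tuning on real data:", mse(real_test, w1, b1))
```

In this toy setup the synthetic data gets the model close, and a short fine-tuning pass on a handful of real samples corrects the remaining gap between the synthetic generator and reality.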

Sami Dahlman presenting Vaisto's synthetic data solution

The evening concluded with pizza and networking 😎 We hope to see all of you back next time! Based on the feedback you sent, we’ll get a separate microphone for the audience as well :) Remember to sign up!

If you would like to present a topic that you find valuable or interesting, let us know. You can also propose topics, and we’ll get the experts on stage!

We thank the Tampere AI Ecosystem for providing us with a very nice venue and help with the practicalities!

Don't hesitate to contact us if your data or cloud projects could use a productivity boost.