VML Team · Dec 21, 2023

6 min read

VML 2023 Wrapped 🚀

Another year for the books! We want to look back at the highlights from 2023. Let's dive in.

It’s been a busy time for the VML team. This was a year of document understanding, experimenting with generative models, new releases, purchase lines, and making our core products, Smartscan and Autosuggest, even better. Here’s a roundup of some exciting milestones and achievements we’ve reached in 2023.

Running through our systems this year:

  • 110M documents scanned with Smartscan.
  • 190M calls to Autosuggest.

This year, our community grew stronger as we welcomed new customers and integrated with new products. A warm welcome to all! We are proud that Visma Machine Learning products now power more than 600K companies.

Product Milestones

All of the coolest things we did this year:

  • New and improved Autosuggest
  • Twice as fast with FlashAttention
  • First generative model trained in May
  • Document Chat: Ask Smartscan Anything
  • Purchase Lines hit Staging
  • Streamlining Model Deployment

... and there is much more below for the extra-curious reader.

New and improved Autosuggest

One of our goals for 2023 was to expand Autosuggest and make it as simple to use as Smartscan. We are happy to report that the groundwork is done. The Transformer-based version of Autosuggest, which drastically reduces error rates, will hit production in early 2024 (Q1).

Smartscan Developments

This year marked a record number of Smartscan releases, ensuring you always have access to the latest features. Here are some of the things that stood out:

šŸŽ New fields added to our feature set

We added six new features this year, you can find them in our feature list.

šŸŽ Twice as fast with FlashAttention

We adopted FlashAttention earlier this year, which speeds up our product. The team thoroughly investigated this for our Smartscan production model, and we're super happy with the results (delivering predictions up to 2x-3x faster).
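For the extra-curious reader, here is a toy sketch of the idea FlashAttention builds on: attention can be computed block by block with a running ("online") softmax, without ever materialising the full list of attention weights. This is not our production code, and it does not reproduce the speedup, which in the real thing comes from fused GPU kernels that avoid slow memory traffic. The function names and the single-query, pure-Python setup are illustrative assumptions.

```python
import math


def _scores(q, keys):
    """Scaled dot-product scores of one query against a list of keys."""
    scale = 1.0 / math.sqrt(len(q))
    return [scale * sum(a * b for a, b in zip(q, k)) for k in keys]


def naive_attention(q, keys, values):
    """Standard attention for one query: softmax over all scores at once."""
    scores = _scores(q, keys)
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(dim)]


def blockwise_attention(q, keys, values, block_size=2):
    """Same result, but keys/values are visited one block at a time.

    Only a running maximum, a running normaliser and a running weighted
    sum are kept; earlier contributions are rescaled when the maximum grows.
    """
    dim = len(values[0])
    running_max = float("-inf")
    running_sum = 0.0
    acc = [0.0] * dim
    for start in range(0, len(keys), block_size):
        block_scores = _scores(q, keys[start:start + block_size])
        block_values = values[start:start + block_size]
        new_max = max(running_max, max(block_scores))
        # Rescale what has been accumulated so far (factor is 0.0 on the first block).
        rescale = math.exp(running_max - new_max)
        running_sum *= rescale
        acc = [a * rescale for a in acc]
        for s, v in zip(block_scores, block_values):
            w = math.exp(s - new_max)
            running_sum += w
            for d in range(dim):
                acc[d] += w * v[d]
        running_max = new_max
    return [a / running_sum for a in acc]
```

Both functions return the same vector up to floating-point rounding, which is why this kind of optimisation changes speed rather than predictions.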

šŸŽ First generative model "YoBerDa" trained in May

The team worked on a new Smartscan model that we have given the nickname YoBerDa. The YoBerDa model is VML's very own first generative model which was pre-trained in May and can generate responses to tasks similar to ChatGPT. Unlike ChatGPT, YoBerDa is specifically trained to analyse both text and images in order to understand documents and generate output that can describe documents.

YoBerDa...

  • ... uses our own, in-house researched model architecture.
  • ... has 1.4 billion parameters.
  • ... took ~3 weeks to pre-train on 64 x A100 GPUs.
  • ... is already the model behind our question answering (QA) demo and purchase lines feature.
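To put those numbers in perspective, here is a rough back-of-the-envelope calculation. It treats "~3 weeks" as exactly 21 days of continuous training and assumes 16-bit weights (2 bytes per parameter); both are assumptions made for illustration, not figures from our training logs.

```python
# Figures from the list above.
NUM_GPUS = 64
TRAINING_WEEKS = 3            # "~3 weeks", treated as exactly 21 days (assumption)
NUM_PARAMETERS = 1.4e9

# Assumption: weights stored in a 16-bit format, i.e. 2 bytes per parameter.
BYTES_PER_PARAMETER = 2

gpu_hours = NUM_GPUS * TRAINING_WEEKS * 7 * 24
weights_gb = NUM_PARAMETERS * BYTES_PER_PARAMETER / 1e9

print(f"Approximate pre-training compute: {gpu_hours:,} GPU-hours")
print(f"Approximate size of the weights:  {weights_gb:.1f} GB")
```

Under these assumptions, pre-training comes to roughly 32,000 A100 GPU-hours, and the weights alone take up about 2.8 GB.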

The Data Science team agrees that pre-training the YoBerDa model was one of the most exciting team efforts of the year:

"It was a long process involving developing the architecture, running the ablation study, and preparing for and executing the actual training, and it was just super exciting."

šŸŽ Ask Smartscan Anything with a multi-lingual QA version

We released a demo version of a question answering (QA) API, where you can ask Smartscan anything, in your own (natural) language. Our YoBerDa Model is behind this API, which has been trained on document understanding and our own existing data.

šŸŽ And last but not least: Purchase Lines hit Staging

There is an exciting gift for our customers. We’re very pleased to announce that the much expected purchase line outputs are now available in Smartscan in our staging environment. The purchase lines are simply added as a new feature in Smartscan.

This pre-production release returns the following fields for each line item:

  • code - the product code
  • description - descriptive text
  • quantity
  • item_number
  • unit - e.g. 'pieces' or 'kg'
  • total_incl_vat
  • total_excl_vat
  • total_vat
  • percentage_vat
  • unit_price_incl_vat - a per item price if given
  • unit_price_excl_vat
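If you want to start experimenting on the client side, a purchase line could be modelled along these lines. This is a minimal sketch, not the actual Smartscan response format: only the field names come from the list above, while the types, the `PurchaseLine` class, and the helpers `parse_purchase_line` and `totals_are_consistent` are hypothetical. Every field is optional here because a document may not contain all of them.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class PurchaseLine:
    # Field names follow the list above; the types are assumptions.
    code: Optional[str] = None
    description: Optional[str] = None
    quantity: Optional[float] = None
    item_number: Optional[str] = None
    unit: Optional[str] = None
    total_incl_vat: Optional[float] = None
    total_excl_vat: Optional[float] = None
    total_vat: Optional[float] = None
    percentage_vat: Optional[float] = None
    unit_price_incl_vat: Optional[float] = None
    unit_price_excl_vat: Optional[float] = None


def parse_purchase_line(raw: dict) -> PurchaseLine:
    """Build a PurchaseLine from a dict, ignoring keys we do not know."""
    known = {f.name for f in fields(PurchaseLine)}
    return PurchaseLine(**{k: v for k, v in raw.items() if k in known})


def totals_are_consistent(line: PurchaseLine, tolerance: float = 0.01) -> bool:
    """Check total_incl_vat against total_excl_vat + total_vat.

    Assumes the three totals relate in the usual way; returns True when
    any of them is missing, since there is nothing to compare.
    """
    parts = (line.total_incl_vat, line.total_excl_vat, line.total_vat)
    if any(p is None for p in parts):
        return True
    return abs(line.total_incl_vat - (line.total_excl_vat + line.total_vat)) <= tolerance
```

A sanity check like `totals_are_consistent` is a cheap way to flag lines that may need a manual look while the feature is still in staging.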

To those who have contributed valuable input during the development process, thanks a ton! We are happy to align this feature with our customers’ requirements, and your feedback will help us make it even better.

Important to know:

  • We currently process up to a maximum of five pages with this API.
  • The API will be noticeably slower if you request purchase lines (it's part of the Smartscan API). We will work on improving this before releasing into production.

Streamlining Model Deployment: Navigating KServe

Implementing KServe this year has been a huge step towards a standardised and productive approach to serving and deploying ML models. Among its standout features are an efficient flow of data and support for multiple ML frameworks, which gives us flexibility across a diverse range of models.

While the journey with KServe has presented its fair share of challenges, our Engineering team has shown great commitment to continuously improving our model deployment processes. Our Lead Engineer adds:

“It’s been a challenge to implement KServe, but the level of maturity we’ve achieved puts VML as a leading team in terms of hosting ML models in Visma.”

See the official KServe website for more details and the list of supported frameworks.

Conference Chronicles

Our team members also attended some inspiring conferences this year.

Most of the Data Science team attended NeurIPS, one of the biggest machine learning conferences, in New Orleans, LA. The team engaged with AI researchers from around the world and from institutions like Stanford, Google, DeepMind, and Meta. One of the key takeaways: “We noticed that we are definitely at the forefront in terms of using and adopting the latest technology, such as generative models and FlashAttention.”

The Engineering team participated in the WeAreDevelopers World Congress in Berlin, one of the best places to get recent insights and trends in software development. It was another great conference with a lot of inspiring workshops and talks. One of the workshops covered defending against DDoS attacks when you use GCP (which we do), and we've added that topic to our plans for 2024.

Looking ahead to 2024

Thank you for being a part of our community and for making these achievements possible. Also, a big round of applause for the team who pulled it all together.

For 2024, we have an exciting roadmap full of new product developments, more research, better data management, and more.

We wish you all a happy new year ✨
