LinkedIn Open-Sources Dagli

LinkedIn has recently announced the open-sourcing of Dagli, a machine learning library for Java and other JVM languages. The library will ostensibly make it easier for developers to create model pipelines that are bug-resistant, readable, modifiable, maintainable, and deployable, without incurring technical debt.

According to one industry report, even as machine learning matures and finds new applications, about half of companies spend between 8 and 90 days deploying a single machine learning model, with 18% taking longer than 90 days. Much of this can be attributed to difficulty scaling, challenges with model reproducibility, a lack of executive buy-in, and poor tooling.

With Dagli, the model pipeline is defined as a directed acyclic graph (DAG), according to media reports. The graph consists of vertices and edges, with each edge directed from one vertex to another, and the same graph is used for both training and inference. Dagli's environment provides developers with pipeline definitions, near-ubiquitous immutability, and static typing.
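To make the graph idea concrete, here is a minimal Java sketch of a statically typed pipeline built from vertices and edges. It is not Dagli's API: `TinyDag`, `Node`, `placeholder`, `map`, `combine`, and `buildPipeline` are hypothetical names invented for illustration, and the sketch only evaluates a fixed graph on one example. It does not train anything.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical illustration only: none of these names come from Dagli.
public final class TinyDag {
    /** A vertex that yields a value of type R for an input example of type I. */
    interface Node<I, R> {
        R apply(I input);
    }

    /** Root vertex: stands in for the example fed into the graph. */
    static <I> Node<I, I> placeholder() {
        return input -> input;
    }

    /** Vertex with one incoming edge from its parent. */
    static <I, A, R> Node<I, R> map(Node<I, A> parent, Function<A, R> fn) {
        return input -> fn.apply(parent.apply(input));
    }

    /** Vertex with two incoming edges. */
    static <I, A, B, R> Node<I, R> combine(
            Node<I, A> left, Node<I, B> right, BiFunction<A, B, R> fn) {
        return input -> fn.apply(left.apply(input), right.apply(input));
    }

    /** Wires up a small diamond-shaped graph: text -> {length, exclamations} -> label. */
    static Node<String, String> buildPipeline() {
        Node<String, String> text = placeholder();
        Node<String, Integer> length = map(text, String::length);
        Node<String, Long> exclamations =
                map(text, s -> s.chars().filter(c -> c == '!').count());
        // The compiler checks that each edge carries the type its target expects.
        return combine(length, exclamations,
                (len, bangs) -> (len > 20 || bangs >= 2) ? "loud" : "calm");
    }

    public static void main(String[] args) {
        Node<String, String> pipeline = buildPipeline();
        System.out.println(pipeline.apply("Hello!!")); // prints "loud"
        System.out.println(pipeline.apply("Hello"));   // prints "calm"
    }
}
```

Because every vertex is created once and never modified, and each edge is type-checked at compile time, wiring a `Node<String, Long>` into a step that expects an `Integer` fails at compile time rather than in production. That is the kind of logic bug that static typing and immutability are meant to prevent.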

Jeff Pasternack, an NLP research scientist at LinkedIn, wrote in a blog post that models are traditionally part of an integrated pipeline, and that constructing, training, and deploying these pipelines to production remains a challenging task. “Duplicated or extraneous work is often required to accommodate both training and inference, engendering brittle ‘glue’ code that complicates future evolution and maintenance of the model,” stated Pasternack.

Dagli works on servers, Hadoop, command-line interfaces, IDEs, and other typical JVM contexts. It also comes with many ready-to-use built-in pipeline components, including neural networks, gradient boosted decision trees, logistic regression, FastText, cross-validation, feature selection, cross-training, data readers, evaluation, and feature transformations.

For experienced data scientists, Dagli offers a path to production-ready AI models that are maintainable and extensible in the long term and that can leverage an existing JVM technology stack. For less experienced software engineers, the library provides an API that, used with a JVM language and its tooling, helps avoid typical logic bugs.

According to Pasternack, Dagli was created to make efficient, production-ready models easier to write, revise, and deploy, while avoiding technical debt and long-term maintenance challenges. Dagli also leverages modern, highly multicore processors and powerful graphics cards for effective single-machine training of real-world models.

The launch of Dagli comes after LinkedIn made available the LinkedIn Fairness Toolkit (LiFT), an open-source software library designed to enable the measurement of fairness in AI and machine learning workflows. LinkedIn also debuted DeText, an open-source framework for NLP-related ranking, classification, and language generation tasks. DeText leverages semantic matching, using deep neural networks to understand member intents in search and recommender systems.

This article has been published from the source link without modifications to the text. Only the headline has been changed.
