While visual ‘no code’ tools are helping businesses get more out of computing without the need for armies of in-house techies to configure software on behalf of other staff, access to the most powerful tech tools, at the ‘deep tech’ AI coal face, still requires some expert help (and costly in-house expertise).
This is where bootstrapped French startup NLPCloud.io is plying a trade in MLOps/AIOps, or ‘computing platform as a service’ (as it runs the queries on its servers), with a focus on natural language processing (NLP), as its name suggests.
In recent years, artificial intelligence developments have led to impressive advances in the field of NLP. (Although it’s worth emphasizing that the bulk of NLP research has focused on the English language, meaning that’s where this tech is most mature, so associated AI advances are not universally distributed.) This technology can help businesses scale their capacity to intelligently grapple with all sorts of communications by automating tasks like Named Entity Recognition, sentiment analysis, text classification, summarization, question answering, and Part-Of-Speech tagging, freeing up (human) staff to focus on more complex/nuanced work.
Production-ready (pre-trained) NLP models for English are readily available ‘out of the box’. There are also dedicated open-source frameworks offering help with training models. But businesses wanting to tap into NLP still need the DevOps resource and chops to implement NLP models. NLPCloud.io is catering to businesses that don’t feel up to the implementation challenge themselves.
Its API is based on Hugging Face and spaCy open-source models. Customers can either use ready-to-use pre-trained models (it selects the “best” open-source models; it does not build its own), or upload custom models developed internally by their own data scientists, which is a point of differentiation vs. SaaS services such as Google Natural Language (which uses Google’s ML models), Amazon Comprehend and Monkey Learn.
NLPCloud.io wants to democratize NLP by helping developers and data scientists deliver these […] price”. (It has a tiered pricing model based on requests per minute, which starts at $39 per month and ranges up to $1,199 per month, at the enterprise end, for one custom model running on a GPU. It also lets users try models at low request velocity without incurring a charge.)
“The idea came from the fact that, as a […], I saw many AI projects fail because of the deployment to the production phase,” says sole founder and CTO Julien Salinas. “Companies often focus on building accurate and fast AI models, but today, more and more excellent open-source models are available and are doing an excellent job… so the toughest challenge now is efficiently using these models in production. It takes AI, DevOps, and programming skills… which is why it’s a […], and I decided to launch NLPCloud.io.”
The platform launched in January 2021 and now has around 500 users, including 30 paying for the service. The startup, which is based in Grenoble, in the French Alps, is a team of three for now, plus a couple of independent contractors. (Salinas […] by the end of the year.)
“Most of our users are tech startups, but we also start having a couple of bigger companies,” he says. “The biggest demand I’m seeing is from […] and data scientists. Sometimes it’s from teams who have […] skills but don’t have DevOps skills (or don’t want to spend time on this). Sometimes it’s from […] who want to leverage NLP out-of-the-box without hiring a whole data science team.”
“We have very diverse customers, from solo […] to bigger companies like BBVA, Mintel, Senuto… in all sorts of sectors (banking, public relations, market research),” he adds. Use cases among its customers include lead generation from unstructured text (such as web pages) via named entity extraction, and sorting support tickets based on urgency by conducting sentiment analysis.
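The ticket-sorting use case can be pictured with a minimal Python sketch. It is an illustration only, not NLPCloud.io's implementation: the `Ticket` class, the `NEGATIVE_WORDS` list and the `negativity_score` function are hypothetical stand-ins for a real sentiment model, and the assumption that more negative text means a more urgent ticket is ours.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    ticket_id: int
    text: str


# Hypothetical stand-in for a real sentiment model: a tiny word list.
NEGATIVE_WORDS = {"broken", "urgent", "angry", "refund", "down", "failed"}


def negativity_score(text: str) -> float:
    """Return the fraction of words found in NEGATIVE_WORDS (toy scorer)."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    if not words:
        return 0.0
    return sum(w in NEGATIVE_WORDS for w in words) / len(words)


def sort_by_urgency(tickets, scorer=negativity_score):
    """Order tickets from most to least negative, treating negativity as urgency."""
    return sorted(tickets, key=lambda t: scorer(t.text), reverse=True)


tickets = [
    Ticket(1, "How do I change my avatar?"),
    Ticket(2, "Site is down and payment failed, urgent!"),
    Ticket(3, "My order arrived broken."),
]
ordered = sort_by_urgency(tickets)
print([t.ticket_id for t in ordered])
```

In practice the `scorer` argument would be swapped for a call to a trained sentiment model; the sorting step stays the same.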
Content marketers also use its platform for headline generation (via summarization), while text classification capabilities are being used for economic intelligence and financial data extraction, per Salinas. He says his experience as a CTO and software engineer working on NLP projects at several companies led him to spot an opportunity in the challenge of AI implementation.
“I realized that it was quite easy to build acceptable NLP models thanks to great open-source frameworks like spaCy and Hugging Face Transformers, but then I found it quite hard to use these models in production,” he explains. “It takes programming skills to develop an API, strong DevOps skills to build a robust and fast infrastructure to serve NLP models (AI models, in general, consume a lot of resources), and data science skills, of course.
“I tried to look for ready-to-use cloud solutions to save […], but I couldn’t find anything satisfactory. My intuition was that such a platform would help […], sometimes months of work for the teams who don’t have strong DevOps profiles.”
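To give a sense of the programming side of that work, here is a minimal sketch of a model wrapped in an HTTP API, using only Python's standard library. Everything specific in it is an assumption made for illustration: the `classify` function is a hypothetical stub standing in for a real NLP model, and the `text` request field, the port and the `serve` helper are invented names, not NLPCloud.io's API.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def classify(text):
    """Hypothetical stand-in for a real NLP model call."""
    label = "question" if text.strip().endswith("?") else "statement"
    return {"label": label}


def handle_payload(raw_body):
    """Validate a JSON request body and run the model; returns (status, response)."""
    try:
        payload = json.loads(raw_body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return 400, {"error": "body must be valid JSON"}
    text = payload.get("text") if isinstance(payload, dict) else None
    if not isinstance(text, str):
        return 400, {"error": "missing string field 'text'"}
    return 200, classify(text)


class ModelHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the client declared.
        length = int(self.headers.get("Content-Length", 0))
        status, body = handle_payload(self.rfile.read(length))
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(host="127.0.0.1", port=8000):
    """Start the server; blocks until interrupted."""
    HTTPServer((host, port), ModelHandler).serve_forever()
```

Calling `serve()` starts a single-process server that accepts POST requests with a JSON body such as `{"text": "Is it up?"}`. The sketch leaves out the harder parts Salinas describes, such as scaling, GPU scheduling and monitoring.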
“NLP has been around for decades, but until recently, it took whole teams of data scientists to build acceptable NLP models. For a couple of years, we’ve […] regarding the accuracy and speed of the NLP models. More and more experts who have been working in the NLP field for decades agree that NLP is becoming a ‘commodity’,” he goes on. “Frameworks like spaCy make it extremely simple for developers to leverage NLP models without having advanced data science knowledge. And Hugging Face’s open-source repository for NLP models is also a significant step in this direction.
“But having these models run in production is still hard, and maybe even harder than before, as these brand new models are very demanding regarding resources.”
The models NLPCloud.io offers are picked for performance, where “best” means it has “the best compromise between accuracy and speed”. Salinas also says they are paying mind to context, given that NLP can be used for diverse use cases, proposing several models so as to adapt to a given use.
“Initially, we started with models dedicated to entities extraction only, but most of our first customers also asked for other use cases too, so we started adding other models,” he notes, adding that they will continue to add more models from the two chosen frameworks — “to cover more use cases and more languages”.
SpaCy and Hugging Face, meanwhile, were chosen as the source for the models offered via its API based on their track record as companies, the NLP libraries they provide, and their focus on production-ready frameworks, with the combination allowing NLPCloud.io to offer a selection of models that are fast and accurate, working within the bounds of respective tradeoffs, according to Salinas.
“SpaCy was developed by a reliable company in Germany called Explosion.ai. This library has become one of the most used NLP libraries […] who want to leverage NLP in production ‘for real’ (instead of academic research only). The reason is that it is very fast, has great accuracy in most scenarios, and is an ‘opinionated’ framework, which makes it very simple to use by non-data scientists (the tradeoff is that it gives fewer […]).
“Hugging Face is an even more solid company that recently […] for a good reason: They created a disruptive NLP library called ‘transformers’ that greatly improves NLP models’ accuracy (the tradeoff is that it is very resource-intensive, though). It allows for covering more use cases like sentiment analysis, classification, summarization… In addition, they created an open-source repository where it is easy to select the best model you need for your use case.”
While AI is advancing at a clip within certain tracks, such as NLP for English, there are still caveats and potential pitfalls attached to automating language processing and analysis, with the risk of getting stuff wrong or worse. For example, AI models trained on human-generated data have been shown to reflect the embedded biases and prejudices of the people who produced the underlying data. Salinas agrees NLP can sometimes face “concerning bias issues”, such as racism and misogyny. But he expresses confidence in the models they’ve selected.
“Most of the time, it seems [bias in NLP] is due to the underlying data used to train the models. It shows we should be more careful about the origin of this data,” he says. “In my opinion, the best solution to mitigate this is that the community of NLP users should actively report something inappropriate when using a specific model so that this model can be paused and fixed.”
“Even if we doubt that such a bias exists in the models we’re proposing, we […] such problems to us so we can take measures,” he adds.