XetHub raises $7.5 million for its Git-based data collaboration platform

Seattle-based Madrona led a $7.5 million initial funding round for XetHub, a company that makes it easier for teams to use Git for data management. The core idea is to let developers work with data the way they already work with code, with all the collaborative capabilities that a tool like Git makes possible. The team describes XetHub as a “collaborative storage platform for data management.”

The firm was founded by three veterans of large-scale data platforms: CEO Yucheng Low, CTO Ajit Banerjee, and CPO Rajat Arya. Low co-founded the ML startup Turi, where Arya was the first employee. After Apple acquired Turi in 2016, both Low and Arya went on to work on different layers of Apple’s machine learning (ML) platform stack, with Arya leading Apple’s data platform team. The pair met Banerjee at Apple. He had previously worked at Inktomi, Amazon, and Facebook, and had also founded two companies.

With XetHub, you can browse your repository much as you would on GitHub, making it easy to navigate and analyse your data. XetHub supports several file formats (including CSV) and can provide automated summaries of the data.

As they developed Apple’s data platform, the team came to recognise that the field of data management still had many areas where it might be improved.

“It’s something of an open secret: data is considerably more crucial than anything else. More crucial than the model, more crucial than everything else,” Low told me. “Managing where and how you keep and share this information is crucial. Yet we observe that the way we manage data now is remarkably similar to how source code was handled 30 years ago. That is, version control or collaboration is done by copy-and-paste. Sometimes there is a more elaborate version of it, but ultimately it is copy-and-paste if I want to make sure no one else is touching what I’m doing.”

XetHub’s goal is to give developers, when they work with data, the same familiar primitives they have come to rely on in tools like Git when collaborating on source code.

According to Low, this “allows developers to work with data in precisely the same manner as code for the first time.” He said that the team’s goal was to build a tool that retained the fundamental Git user experience and worked with the integrations that developers are already used to.

XetHub is essentially an extension to Git that allows large files to be stored and transferred, with data deduplication and complete Git compatibility.
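To make the idea of data deduplication concrete, here is a minimal Python sketch of a content-addressed chunk store. It is an illustration only, not XetHub's implementation: the `ChunkStore` class, the tiny fixed `CHUNK_SIZE`, and the file names are all hypothetical, and production systems typically use much larger, often content-defined, chunks.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; deliberately tiny so the effect is visible (assumption)


class ChunkStore:
    """Toy content-addressed store: identical chunks are kept only once."""

    def __init__(self):
        self.chunks = {}  # SHA-256 digest -> chunk bytes
        self.files = {}   # file name -> ordered list of chunk digests

    def put(self, name, data):
        """Split data into fixed-size chunks and store each unique chunk once."""
        digests = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # no-op if already stored
            digests.append(digest)
        self.files[name] = digests

    def get(self, name):
        """Reassemble a file from its chunk digests."""
        return b"".join(self.chunks[d] for d in self.files[name])

    def stored_bytes(self):
        """Bytes physically held after deduplication."""
        return sum(len(chunk) for chunk in self.chunks.values())


store = ChunkStore()
store.put("v1.csv", b"aaaabbbbcccc")  # first version of a file
store.put("v2.csv", b"aaaabbbbdddd")  # second version: only the last chunk differs

logical_bytes = sum(len(store.get(name)) for name in store.files)
print(logical_bytes, store.stored_bytes())
```

The two versions add up to 24 logical bytes, but the store holds only four unique chunks, because the chunks the versions share are saved once. This is the general reason a deduplicated quota can stretch further than the raw size of the files placed in it.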

The service can currently manage repositories up to 1TB in size, with future expansion to 100TB planned. While most developers won’t want to clone such a massive repository, the ability to mount it and have it behave like a local file system is a nice perk for those who want it. This is true whether the developer is working on a laptop or a massive GPU cluster. It’s also important to note that the service is agnostic about what kind of file you’re working with.

While the company’s go-to-market focus is on AI/ML teams, users are free to use XetHub to manage data of any sort.

A free community version of XetHub is now available, allowing users to manage up to 20 GB of deduplicated storage at no cost. According to Low, the business is in talks with a few large corporations, but the team isn’t ready to disclose any names just yet.

Madrona’s managing director Matt McIlwain said that Low and the XetHub team have been at the forefront of machine learning innovation for over a decade, including at Apple. “XetHub allows developers to work with enormous datasets, in cooperation with others, to create intelligent and generative apps,” he said. From the developer’s perspective, he added, “old infrastructure and complicated data operations are limiting the ability to design and deploy these apps,” a problem that XetHub aims to solve.