Join me for an interview with Don Scheibenreif, a distinguished analyst at Gartner and co-author of “When Machines Become Customers,” when Hashtag Trending, the Weekend Edition goes to air on Saturday morning. Spoiler alert: the answer to when machines become customers is sooner than you think.
Generative AI goes open source, Amazon Web Services takes on Google and Microsoft with a new offering to democratize AI, and is the Broadcom purchase of VMware doomed by the European Commission?
These stories and more on Hashtag Trending, for Friday April 14th. I’m your host Jim Love, CIO of ITWC – IT World Canada and TechNewsDay in the US.
Hello Dolly: generative AI goes open source. Analytics firm Databricks has released Dolly 2.0, a large AI text model similar to ChatGPT. Dolly can handle chatbot conversations, text summarization and basic search.
But here’s a critical point: Dolly is open source, under a license that allows independent developers and even companies to use it for commercial purposes.
The company’s blog states that “We are open-sourcing the entirety of Dolly 2.0, including the training code, the dataset, and the model weights, all suitable for commercial use. This means that any organization can create, own, and customize powerful LLMs that can talk to people, without paying for API access or sharing data with third parties.”
Perhaps even more important is that Dolly 2.0 was not trained using any answers from ChatGPT or any other proprietary model. Databricks used volunteers from its employees to generate a data set that it says is “not tainted.”
What they mean by not tainted is that anyone using materials from ChatGPT is bound by the restriction that they not create a model that competes with OpenAI. That restriction applies to a number of other models that have been developed recently. We did a story on Stanford’s Alpaca, but there is a list of others, all of which, it is assumed, have the same restriction.
How robust is the training of Dolly? Again, according to Databricks’ blog post, they used their 5,000 employee volunteers to come up with 15,000 instructions. To put this in perspective, this is more than the 13,000 instructions that were used as the benchmark in a recent paper for a model called InstructGPT, which reported that it had greater accuracy than ChatGPT 3.5.
There is one question that needs to be asked. Kyle Wiggers, who wrote about this story in TechCrunch, asked: why is a commercial, for-profit company, whose “bread and butter is data analytics,” open sourcing a text-generating AI model?
The answer, according to Databricks CEO Ali Ghodsi, is philanthropy.
He said, “We are in favor of more open and transparent large language models (LLMs) in the market in general because we want companies to be able to build, train and own AI-powered chatbot and other productivity apps using their own proprietary data sets.” He added, “We might be the first but hope not to be the last.”
Databricks has made the model and even instructions available on their website. A link to those resources is included in the text version of this podcast.
CEO Ghodsi notes that this is just the beginning. “Databricks,” he said, “is deeply committed to making it simple for customers to use LLMs. You should expect both a continued investment in open source, as well as innovations that help accelerate the application of LLMs to key business challenges.”
Sources: Databricks blog, TechCrunch
Amazon Web Services announced its entry into Generative AI taking on both Google and Microsoft with a two-pronged approach that they say democratizes AI – making large AI models accessible to enterprises and also providing services for smaller companies and developers.
The enterprise offering is called Amazon Bedrock. Bedrock makes what Amazon is referring to as Foundation Models (FMs) like OpenAI and others available to companies and adds its own value proposition: pre-packaged offerings and its scalable infrastructure to, in its words, democratize access for builders.
Bedrock offers access to a range of powerful Foundation Models for text and images, and it also includes Amazon’s Titan FMs, two Large Language Models that Amazon announced today.
The vision is one of a serverless experience, allowing companies to find and pick the right model, get up and running quickly, customize these models with their own data and integrate them into their own corporate applications.
AWS will have instances they are calling Inf2, which are optimized specifically for large-scale generative AI applications, with models able to deal with billions of parameters.
But they also announced what they call Amazon CodeWhisperer, aimed at developers and leveraging generative AI to do the “heavy lifting” by writing much of what they call undifferentiated code, freeing up developers to do the more “creative aspects of coding.”
Apple announced on Thursday that it would only use recycled cobalt in batteries by 2025. This is part of its stated efforts to have all its products be carbon neutral by 2030.
Apple also mentioned that other rare elements will come from recycled materials, for example recycled tin for soldering and recycled gold for plating on circuit boards. Recycling of many metals is often less energy intensive than mining and refining them.
But cobalt is a special case. It is a critical component of batteries used in consumer electronics. It is also a byproduct of copper or nickel mining, but the best source comes from large, easily accessible surface deposits in the Democratic Republic of Congo.
Under the current regime, there are reports of deaths of children who are forced to mine the cobalt. Several tech companies have been accused of being complicit in these deaths by continuing to source their cobalt from the southern Congo.
Apple reported that it was making progress towards its goal. A quarter of all cobalt used in Apple’s products came from recycling in 2022, an increase of 13 per cent from the previous year. They also reported that they now source over three quarters of rare earth materials, as well as two thirds of the aluminum and more than 95 per cent of the tungsten, from recycled materials.
Broadcom’s purchase of VMware hit another snag today when the European Commission filed a statement of objections claiming that the purchase may be harmful to competition.
The Brussels-based commission expressed concerns about a wide range of suppliers who might be adversely affected, including suppliers whose parts interoperate with VMware’s hypervisor, claiming that under the Broadcom takeover, these vendors will lose access, or have restricted access, to VMware’s virtualization software.
This would allow Broadcom to potentially make changes to prevent competitors from using VMware software with anything but Broadcom hardware.
It’s not the first time that these objections have been raised. Last month the UK Competition and Markets Authority found the deal could be harmful to competition for almost the same reasons cited by the European Commission. It also noted that even if Broadcom allowed access to VMware software, Broadcom would potentially have access to information about its competitors, who would have to disclose this information as part of the process of maintaining their compatibility with VMware.
In addition, the US Federal Trade Commission is reported to be looking into the deal.
According to other media reports, Broadcom is not moving to “appease the regulators.”
The hackers who breached Western Digital in early April have claimed that they have stolen 10 terabytes of data, including customer information. According to a report in TechCrunch, the hackers are looking to negotiate a payment in the eight figures, 10 million dollars or more, in exchange for not publishing the data they have stolen.
Up until this point, Western Digital has disclosed what it called a “network security incident” but has not specified what data may have been stolen.
One of the hackers has spoken with TechCrunch and provided details to back up their claims, including a file that was digitally signed with Western Digital’s code-signing certificate, proving they could digitally sign files and effectively impersonate Western Digital.
They also shared phone numbers which they claimed belonged to company executives.
The hackers are claiming their goal was to extract a ransom, but that they decided against encrypting Western Digital’s files.
The ransom negotiations have, according to reports, been difficult. One of the hackers stated that they had called the executives of the company many times, but “they don’t answer and if they do, they listen and hang up.” They have also tried to email executives using personal email addresses since the company’s email system is currently down.
Aside from the sample files, the hackers who spoke with TechCrunch would not specify what kind of customer data they had or how they broke into Western Digital’s network. Western Digital spokesperson Charlie Smalling said that the company declined to comment and would not answer questions or confirm what data was stolen.
The hackers also declined to say anything about their gang, or to even give a name. If they don’t get the ransom from Western Digital, they say they will publish data on the website of another ransomware gang, known as Alphv, who they say they are not related to or affiliated with, but they do say that they regard Alphv as being professional.
And in one of the scariest new uses of AI technology, deep fake voices are being used to deliver what is being called “swatting as a service.”
Swatting as a service is a threat for hire, where you can pay someone to phone in a threat that will draw a SWAT team to a location. In some of the examples, the caller could pretend to be a criminal who wants to confess, or someone claiming to have planted explosives or threatening a mass shooting, anything that would bring in an armed SWAT team.
Not only is this a waste of police resources and a frightening experience for those at the home, school or business that is surrounded or even raided by the police, but it can be lethal. At least one case has been reported to have resulted in the unsuspecting victim being killed.
The publication Motherboard says that they have traced the source to a Telegram account and a group called Torswats, which offers to “close down a school” for $75 or to carry out an “extreme swatting,” which results in the victim being handcuffed and their house searched, for $50.
In one case, the police have arrested and charged a 16-year-old, but Torswats remains operational and able to continue to carry out threats at scale, using a synthesized voice that sounds authentic but is not identifiable. They have also been able to respond to operators’ questions in close to real time, like “where are you located?” or “what happened?” and even “what is your name?”
Swatting and its near relative doxing, where you publish someone’s location and identifying information to allow them to be attacked or intimidated in person or on social media, are rampant, but the use of AI and deep fakes with an “as a service” component takes this to a new and troubling level.
That’s the top tech news for today. Hashtag Trending goes to air five days a week with the daily tech news, and we have a special weekend edition where we do an in-depth interview with an expert on some tech development that is making the news.
Follow us on Apple, Google, Spotify or wherever you get your podcasts. Links to all the stories we’ve covered can be found in the text edition of this podcast at itworldcanada.com/podcasts
We love your comments – good or bad. You can find me on LinkedIn, Twitter, or on Mastodon as @therealjimlove on our Mastodon site technews.social. Or just leave a comment under the text version at itworldcanada.com/podcasts
I’m your host, Jim Love. Have a fabulous Friday!
The post Hashtag Trending Apr.14-Generative AI goes open-source, AWS debuts large language model, Broadcom’s purchase of VMWare hits another roadblock first appeared on IT World Canada.