Tag: Large Language Model

  • Beyond Major AI: How India is Shaping its Distinct AI Revolution?

IT Minister Ashwini Vaishnaw said that India aims to develop small language models (SLMs) to address “distinct” issues, in addition to a large language model (LLM). The government, he added, is examining its own digital models, including smaller models that can tackle specific, distinct problems.

During a virtual address at the Nasscom Technology and Leadership Forum 2025, the IT minister said the government is focused on generating a substantial quantity of non-personal, anonymised datasets for training domestically developed foundational artificial intelligence (AI) models. Vaishnaw also asked whether the country can establish a robust research foundation. India has set up centres of excellence (CoEs), three of which are presently operational, with an additional one allocated in the budget.

    Digital Public Infrastructure Acting as a Catalyst

Vaishnaw emphasised that structured data derived from state-supported digital public infrastructure (DPI) will enable India to “distinguish itself” in the coming months and years in the AI race. He observed that the Centre is applying AI throughout the India Stack, including services like Digi Yatra, Aadhaar, and UPI, and said the outcomes have been exceptional. “The efficacy of DPI can be amplified by factors of 10, 15, or even 100, and we incorporate AI in this process,” he said.

“This will be a significant advantage for us due to our existing foundation,” he added. In a virtual fireside chat, Vaishnaw announced that Japan has granted a patent for UPI’s “gateway system.” Media reports indicated that the IT minister said the nation intends to shift from a services-oriented economy, particularly in IT services, to a product-centric powerhouse in sectors including semiconductors, AI, and consumer electronics. He stated that India’s aspirations in AI go beyond simple service delivery or application creation: the country could have confined itself to being a hub for use cases and application services, but it aspires to far greater ambitions.

    Government’s Future Plan in the AI Sector

The minister stated that the Centre is pursuing a comprehensive AI strategy, encompassing the development of indigenous foundational models, the creation of anonymised non-personal datasets for training, the establishment of centres of excellence for AI research, and the integration of AI education into institutions. He also disclosed that 25 semiconductor products will be developed at the five upcoming semiconductor units in the country, where construction is currently under way.

    Regarding AI governance, Vaishnaw emphasised that the Centre will maintain a regulatory approach that fosters innovation rather than hinders it. “We must address the potential harm to society and regulate it; yet, we should not allow innovation to be suppressed as has occurred in numerous other nations,” he stated.

This comes on the heels of reports indicating that the Centre has received at least 67 proposals for the development of a domestic foundational AI model. The government has reportedly received offers for the development of 20 LLMs from domestic AI businesses, including Sarvam AI, CoRover, and Ola’s Krutrim.


    India Working on AI Governance Regulations: FM Nirmala Sitharaman
    Finance Minister Nirmala Sitharaman confirms India is developing AI governance regulations to ensure responsible and ethical artificial intelligence use.


  • Amid the DeepSeek Frenzy, Meta Plans to Invest “Hundreds of Billions of Dollars” in AI

Mark Zuckerberg, the CEO of Meta, isn’t overly concerned about DeepSeek’s ascent, even though the Chinese AI lab’s rapid rise has shocked Wall Street and Silicon Valley. In fact, Zuckerberg stated on January 29 that Meta’s open-source strategy, built around the large language model (LLM) Llama, has “strengthened our conviction that this is the right thing for us to be focused on.”

    “There’s a number of novel things that they did that we’re still digesting… a number of advances that we will hope to implement in our systems, and that’s part of the nature of how this works,” Zuckerberg stated on the company’s earnings conference call. Every new firm that launches, whether or not it is a Chinese competitor, will have some new innovations that the rest of the industry may learn from, according to the head of Meta.

    DeepSeek’s Gain Causing Tremors Among Established Players

With its claims of having built a model that can compete with top-tier models from American companies like OpenAI, Meta, and Google at a fraction of the cost, DeepSeek sent AI-related equities on Wall Street into a tailspin over the past week. Investors were alarmed because US tech companies have been spending billions of dollars developing their AI models and products.

    Zuckerberg stated during the earnings call that he continues to think that making significant investments in infrastructure and capital expenditures will eventually provide a competitive edge. “It’s probably too early to have a strong opinion on what this means for the trajectory around infrastructure and capex,” he stated.

    Meta’s Plan to Outrun its Competitors

According to Zuckerberg, Meta plans to spend “hundreds of billions of dollars” on AI infrastructure in the long run. He declared last week that Meta will ramp up its AI efforts by investing between $60 billion and $65 billion in 2025. He added that a large portion of the compute infrastructure will probably shift from the pre-training stage to building strong “reasoning” models and superior products that will be sold to billions of customers.

Zuckerberg stated that this “doesn’t mean you need less compute,” because you can “apply more compute at inference time in order to generate a higher level of intelligence and a higher quality of service.”

    “As a company that has a strong business model to support this, I think that’s generally an advantage that we’re now going to be able to provide a higher quality of service than others who don’t necessarily have the business model to support it on a sustainable basis,” he stated.

Launch of Llama 4 in the Coming Months

    In the upcoming months, Meta intends to release Llama 4 with native multimodal and agentic capabilities. “Llama 4’s training is going really well. Pre-training for Llama 4 mini is complete, and both our reasoning models and the larger model appear to be doing well,” Zuckerberg stated.

    “With Llama 3, we wanted to make open source competitive with closed models, and with Llama 4, we want to lead,” he continued. Zuckerberg said that it will be feasible to create an AI engineering bot with coding and problem-solving skills comparable to those of a competent mid-level engineer by 2025.


    DeepSeek to Operate on Indian Servers, Says Union Minister
    Union Minister confirms DeepSeek will soon run on Indian servers, addressing privacy concerns and enhancing data security for Indian users.


  • Launched by Sarvam AI, Sarvam 1 LLM is Trained in English and Ten Indic Languages

On October 24, Sarvam AI, an artificial intelligence (AI) firm backed by Lightspeed, unveiled Sarvam 1, a Large Language Model (LLM). In a post on X (formerly Twitter), the company said it is India’s first indigenous multilingual LLM, trained from scratch on domestic AI infrastructure in ten Indian languages and English.

Sarvam 1 supports English and ten major Indian languages: Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, and Telugu. It is a two-billion-parameter language model trained on Nvidia’s H100 Graphics Processing Units (GPUs).

    Sarvam AI uses Nvidia services and AI4Bharat’s open-source technology

To optimise and deploy conversational AI agents with sub-second latency, Sarvam AI also makes use of a range of Nvidia services and products, including its microservices, conversational AI tools, LLM software, and inference server.

    In addition to Nvidia, the LLM made use of AI4Bharat’s open-source technology and language resources, as well as Yotta’s data centres for computational infrastructure. According to a blog post by the AI startup, Sarvam-1’s strong performance and computational efficiency make it especially well-suited for real-world uses, such as deployment on edge devices.

Specifically, Sarvam 1 clearly beats Gemma-2-2B and Llama-3.2-3B on a number of common benchmarks, such as MMLU, Arc-Challenge, and IndicGenBench, while attaining results comparable to Llama 3.1 8B, the company stated.

    Functioning of Various LLM Models Launched by the Company

India’s first Hindi LLM, OpenHathi, was introduced by the AI firm in December 2023. That model was built on Meta AI’s Llama2-7B architecture, with its tokeniser extended to 48,000 tokens. Sarvam 1, by contrast, was trained on a corpus of two trillion tokens.

Thanks to an efficient tokeniser and a custom data pipeline that can produce diverse, high-quality text while preserving factual correctness, the LLM was trained on two trillion tokens that include synthetic Indic data. Sarvam claimed that the latest model from its stable matches or surpasses much larger models like Llama 3.1 8B while being four to six times faster during inference.
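To see why tokeniser efficiency matters for inference speed, note that a language model generates one token per decode step, so a tokeniser that needs fewer tokens per word needs fewer steps for the same text. The sketch below is purely illustrative; the tokens-per-word figures are hypothetical, not Sarvam’s published numbers:

```python
# Illustrative only: decode cost scales with token count, so a tokeniser
# that splits text into fewer tokens makes generation proportionally faster.
text_words = 100                  # length of the desired output, in words

fertility_generic = 6.0           # hypothetical tokens per word for a generic
                                  # tokeniser falling back to bytes on Indic script
fertility_indic = 1.5             # hypothetical tokens per word for an
                                  # Indic-aware tokeniser

steps_generic = text_words * fertility_generic  # decode steps needed
steps_indic = text_words * fertility_indic

print(steps_generic / steps_indic)  # -> 4.0 (4x fewer decode steps)
```

Under these assumed fertilities, the Indic-aware tokeniser cuts the number of decode steps fourfold for the same output, which is the kind of gap behind claims of multi-fold inference speedups.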

Inference in artificial intelligence is the process by which a trained model makes predictions or draws conclusions from fresh data using the patterns it learned during training. Compared to existing Indic datasets, the company’s pretraining corpus, Sarvam-2T, contains eight times as much scientific material, text of three times higher quality, and documents twice as long. Sarvam-2T holds around two trillion Indic tokens in total. Apart from Hindi, which makes up over 20% of the data, the data is distributed nearly evenly among the ten supported languages.
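The training-versus-inference distinction above can be illustrated with a toy model (a simple least-squares line fit, nothing to do with Sarvam’s actual stack): training estimates parameters from example data, and inference applies those fixed parameters to inputs the model has never seen.

```python
# Toy illustration of training vs. inference: a model "learns" a pattern
# from examples, then applies it to fresh inputs without re-learning.

def train(samples):
    """'Training': estimate slope and intercept from (x, y) pairs."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in samples)
             / sum((x - mean_x) ** 2 for x, _ in samples))
    intercept = mean_y - slope * mean_x
    return slope, intercept

def infer(model, x):
    """'Inference': apply the learned pattern to data not seen in training."""
    slope, intercept = model
    return slope * x + intercept

model = train([(1, 3), (2, 5), (3, 7)])  # learns y = 2x + 1
print(infer(model, 10))                  # new input -> 21.0
```

An LLM does the same thing at vastly greater scale: its parameters are fixed after training, and each user prompt is handled purely by inference.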


    AI Firm Sarvam Unveils Blend of Open Source and Enterprise Products
    AI firm Sarvam unveils a new GenAI platform featuring a mix of open source and enterprise products, with support for 10 Indian languages.