2 PB of Huawei Memory Trains a Norwegian AI Model

Norway’s National Library has installed 2 petabytes of Huawei flash memory to train an LLM that understands Norwegian. The project demonstrates that US sanctions have not blocked access to advanced memory infrastructure.

TL;DR: Norway’s National Library is using 2 PB of Huawei flash memory to train a Norwegian-language LLM. The installation includes 122 TB drives built without US-banned components. The project proves that export restrictions did not stop Huawei from delivering petabyte-scale infrastructure.

How Is Norway Using 2 PB of Huawei Flash Memory to Train an LLM?

Norway’s National Library is developing an LLM that understands the Norwegian language, built on its own text collections. The 2-petabyte memory infrastructure from Huawei forms the foundation of the entire project. The installation uses 122 TB drives that Huawei manufactured without using any components covered by US sanctions.

The National Library’s text collections span decades of Norwegian literature, documents, and publications. Training a model on such a large corpus requires fast data access, which justifies choosing flash memory over traditional hard disk drives. Norway is also avoiding dependence on foreign cloud providers.

Why Did Norway Choose Huawei Despite US Sanctions?

Huawei developed its 122 TB drives using QLC NAND memory chips produced by YMTC (Yangtze Memory Technologies Corp). The Chinese manufacturer avoided using technology covered by US export restrictions. As a result, the drives can be sold on international markets without violating sanctions.

US restrictions targeted advanced 3D NAND chips. Huawei and YMTC applied their Xtacking technology, which bonds memory cells and control logic fabricated on separate silicon wafers. This approach circumvents patents protected by US restrictions.

Norway is not bound by US embargoes at the national level. The National Library was able to legally purchase Huawei hardware, choosing a solution that was optimal in terms of both capacity and cost.

What Are the Specs of Huawei’s 122 TB Drive?

Huawei’s 122 TB drive belongs to the OceanDisk family. The device uses an NVMe interface and has been optimized for sequential workloads typical of AI model training. A single drive fits within a standard enterprise form factor.

Comparison of Huawei OceanDisk specs against the competition:

ParameterHuawei OceanDisk 122 TBSamsung PM1743Solidigm D7-P5810
Capacity122 TB15.36 TB12.8 TB
InterfaceNVMePCIe 5.0 NVMePCIe 4.0 NVMe
TechnologyQLC NAND XtackingTLC V-NANDQLC 3D NAND
Sanctions exposureNo US sanctionsStandardStandard

Huawei offers significantly higher capacity density on a single drive. This means fewer devices are needed to reach the 2 PB target. Fewer drives translate to a simplified rack infrastructure.

What Does This Mean for AI Sovereignty in Europe?

The Norwegian project shows that countries can build their own language models without relying on US cloud infrastructure. Norway’s National Library controls the data, the hardware, and the entire training process. This approach strengthens the country’s digital independence.

Interest in national models is growing across Europe. France is developing Mistral, Germany is investing in Aleph Alpha. Norway is going a step further — building memory infrastructure with Huawei, bypassing the dominance of American providers. The project proves that sanctions are not an insurmountable barrier.

Training a local model on national data makes sense. The model will understand Norwegian language, culture, and context better than general-purpose solutions. To understand the scale of the challenge, check out how to train your own LLM from scratch.

How Does Xtacking Technology Bypass US Restrictions?

YMTC’s Xtacking technology involves manufacturing memory cells and the control circuit separately. NAND cells are built on one silicon wafer, the logic on another. The two components are then bonded together in a single package.

US sanctions cover 3D NAND technology above 128 layers. YMTC and Huawei arranged cells horizontally rather than only vertically. This approach increases density without exceeding the layer limit covered by the restrictions.

Huawei has announced Logic Folding technology, which will place control transistors directly beneath memory cells. The solution is expected to be available in approximately five years. The current 122 TB drives already take advantage of available sanctions workarounds.

Read more about Huawei’s technology and sanctions workarounds at Benchmark.pl and ITHardware.pl.

What Data Goes Into the Norwegian LLM?

Norway’s National Library collects materials including books, newspapers, official documents, and academic publications. The text corpus forms the basis for training a model that understands the specifics of the Norwegian language, including dialects and archaic forms.

The library is digitizing materials dating back to the Middle Ages. The LLM is expected to process queries in both historical and contemporary language. 2 PB of flash memory allows the library to store digital versions of collections with fast access during training.

The project involves building a bilingual model — bokmal and nynorsk — the two official written forms of the Norwegian language. This approach requires a doubled training corpus. Full details about the project are available at Blocks & Files.

Why Flash Memory Instead of Hard Drives?

Training LLMs requires sequential reading of massive datasets. Flash memory offers throughput many times higher than HDDs. With 2 PB of data, the difference in training time can range from several days to several weeks.

HDDs with 20-22 TB capacities cost less per gigabyte. However, they require more racks, more power, and offer lower performance. Huawei’s 122 TB flash drives reduce the physical footprint of the entire installation.

Electricity costs in Norway are relatively low thanks to hydroelectric power. Even so, fewer drives mean simpler cooling and management. Choosing flash makes economic sense at this scale. Similar analyses are discussed in the article about whether local LLMs are ready to offload computing infrastructure.

What Benefits Does Flash Memory Provide When Training an LLM on 2 PB of Data?

Huawei OceanDisk flash drives achieve 122 TB capacity per device, significantly reducing data access time compared to HDDs. The 2-petabyte flash installation at Norway’s National Library enables sequential reading of the massive text corpus without the bottlenecks characteristic of mechanical drives. Fewer drives also simplify the rack architecture.

The National Library’s collections span decades of Norwegian literature and documents. Fast access to these materials is critical for efficient processing. Training an LLM requires multiple passes through the same corpus. Flash memory guarantees stable throughput across successive training epochs.

The Norwegian project relies on its own infrastructure, with no dependence on public clouds. Full control over hardware and data is a key advantage of this approach. Similar performance challenges are discussed in the article about how MegaTrain: Full-Precision Training of LLMs with Over 100 Billion Parameters on a Single GPU handles hardware limitations.

What Does the Computing Infrastructure Supporting 2 PB of Flash Look Like?

Installing 2 PB of Huawei flash memory requires adequate computing infrastructure to process data at that throughput. Norway’s National Library connected OceanDisk drives to servers equipped with GPU accelerators, ensuring continuous processing. The entire setup runs on a local cluster, bypassing external cloud services.

Huawei’s memory infrastructure uses the NVMe interface, which minimizes data access latency. This architecture is essential for sequentially reading 2 PB of text during model training. GPUs process data in parallel, while flash memory feeds it at the required speed.

Key components of the Norwegian project’s infrastructure:

  • Huawei OceanDisk drives with 122 TB capacity and NVMe interface
  • GPU accelerators for tensor computations
  • Local server cluster with no public cloud dependencies
  • Power from Norwegian hydroelectric plants
  • Cooling system optimized for a reduced drive count
  • High-throughput network interface connecting storage to servers
  • Redundant architecture protecting against training data loss
  • Software managing training task scheduling

Detailed information about the installation architecture is available at Blocks & Files.

How Does QLC NAND Technology Affect Training Performance?

Huawei OceanDisk drives use YMTC-produced QLC NAND chips that store 4 bits per cell. This technology enables 122 TB capacity on a single drive, which is critical for fitting 2 PB into a minimal number of devices. QLC NAND offers lower endurance compared to TLC, but in an LLM training scenario, sequential reads matter most, not intensive writes.

QLC chips combined with Xtacking technology bypass US patents. YMTC produces memory cells and control circuits on separate silicon wafers. This approach allows Huawei to offer drives with capacities unavailable to Western competitors.

Samsung PM1743 drives max out at 15.36 TB. Huawei offers 122 TB on a single device. The capacity density difference comes from using QLC chips and the Xtacking architecture.

Read more about manufacturing technology that circumvents US sanctions at ITHardware.pl.

What Does the Norwegian Project Mean for Other European Countries?

Norway’s National Library proves that building a national LLM is possible without dependence on US cloud providers. The 2 PB Huawei flash installation shows an alternative path for countries seeking digital sovereignty. The project sets a precedent for other European institutions.

Norway is taking a different route — relying on Huawei hardware and its own data. The bilingual model (bokmal and nynorsk) trained on national collections will better understand local context than general-purpose solutions.

Europe is looking for ways to reduce dependence on US dominance in AI. The Norwegian project shows that sanctions do not block access to advanced infrastructure. Huawei delivered 2 PB of flash legally, without violating export restrictions. This opens possibilities for other countries.

Similar independence efforts are discussed in the article about OpenClaw and the End of the AI Monopoly Era: Will LLMs Become a Commodity?.

What Are the Costs of Building AI Infrastructure with Huawei?

The exact costs of the 2 PB Huawei flash installation at Norway’s National Library have not been disclosed. However, the scale of expenditure can be estimated based on capacity and drive count. At 122 TB per device, reaching 2 PB requires approximately 17 OceanDisk drives, which significantly reduces rack and cooling costs compared to installations using lower-capacity drives.

Cost comparison of different AI storage approaches:

ApproachCapacity per DriveDrives for 2 PBEstimated Rack Footprint
Huawei OceanDisk122 TB~17Minimal
Enterprise SSD15.36 TB~131Medium
HDD22 TB~91Large

Fewer devices mean lower operational costs. Norway additionally benefits from cheap hydroelectric power, which reduces cooling expenses. The Huawei installation is cheaper to maintain than a comparable cluster built on Western drives.

Read more about the 122 TB drive and its specs at ITHardware.pl.

How Does the Bilingual Model Affect Storage Architecture?

Norway’s National Library is training a model that supports two official written forms of the language: bokmal and nynorsk. Each requires a separate text corpus, doubling the storage demand. The 2 PB Huawei flash memory accommodates both collections with room for metadata and model checkpoints.

Bokmal is used by approximately 90% of Norway’s population. Nynorsk is rarer but equally important culturally. The model must process both forms with equal accuracy. Bilingualism increases training complexity, requiring careful balancing of training data.

LLM checkpoints take up significant space. With 2 PB of flash, the library has enough room for regular training state saves. This protects against losing progress in case of hardware failure. Cache management during training is discussed in the article about how Anthropic Shortened Cache TTL on March 6.

What Are the Prospects for Huawei’s Technology Development?

The current 122 TB drives are just the beginning of what the Xtacking architecture can deliver.

Logic Folding technology will allow Huawei to bypass future generations of US sanctions. Arranging cells both horizontally and vertically increases capacity without exceeding the layer limits covered by restrictions. The Chinese manufacturer is investing in developing its own technologies independent of Western patents.

The 122 TB drives use QLC NAND chips from YMTC. Future generations may transition to five-bit-per-cell (PLC) technology, further increasing capacity. Huawei plans to offer drives above 200 TB within a few years. Details about Huawei’s plans are available at Benchmark.pl.

How Does Norway Protect Training Data on 2 PB of Flash?

Norway’s National Library stores decades of Norwegian literature, official documents, and academic publications on 2 PB of Huawei flash memory. The installation leverages redundancy mechanisms built into the OceanDisk architecture, protecting unique cultural collections from loss. The LLM trained on this data will process queries in both historical and contemporary language.

The training data includes materials dating back to the Middle Ages. Digitizing and storing them requires reliable storage. Huawei flash memory offers a lower read error rate than HDDs, which is critical for maintaining corpus integrity.

Key data protection elements in the Norwegian project:

  • Disk array-level redundancy
  • Regular training state checkpoints
  • Backup of the text corpus on separate media
  • OceanDisk health monitoring
  • Physical security for the National Library’s server room
  • Access controls for the training infrastructure
  • Data integrity validation after each training epoch
  • Disaster recovery procedures defined before project launch

Full details about the installation are available at Blocks & Files.

Frequently Asked Questions

How many Huawei OceanDisk drives are needed to reach 2 PB of capacity?

At 122 TB per drive, reaching 2 PB requires approximately 17 Huawei OceanDisk devices (source: ITHardware.pl) — significantly fewer than the 131 Samsung PM1743 drives at 15.36 TB each that would be needed for the same capacity.

Does the Huawei installation in Norway violate US sanctions?

No, Huawei OceanDisk 122 TB drives use YMTC QLC NAND chips with Xtacking technology, which does not contain components covered by US export restrictions (source: ITHardware.pl) — Norway can legally purchase this hardware.

Why doesn’t Norway’s National Library use the cloud for LLM training?

Norway’s National Library chose local infrastructure with 2 PB of Huawei flash memory to maintain full control over national text collections and avoid dependence on foreign cloud providers (source: Blocks & Files) — this approach strengthens the country’s digital sovereignty.

What languages does the Norwegian LLM support?

The model trained by Norway’s National Library supports two official written forms of Norwegian: bokmal (used by approximately 90% of the population) and nynorsk (source: Blocks & Files) — bilingual support requires a doubled training corpus.

Summary

The project at Norway’s National Library with 2 PB of Huawei flash memory yields several important takeaways for the European AI ecosystem:

  • US sanctions did not block access to advanced memory infrastructure — Huawei delivered 122 TB drives legally, bypassing restrictions with Xtacking technology
  • Digital sovereignty is achievable — countries can train their own LLMs on national data without depending on public clouds
  • Flash memory in 122 TB format drastically reduces the physical footprint — 2 PB fits on approximately 17 drives
  • Europe has an alternative to US provider dominance — the Norwegian project sets a precedent for other countries seeking independence
  • National models better understand local context — a bilingual Norwegian model will process queries inaccessible to general-purpose solutions

Want to learn more about training your own language models? Read the guide on how to train your own LLM from scratch and start building solutions tailored to your needs. If you prefer ready-made tools, check out how Lemonade from AMD: A Fast, Open Local LLM Server Using GPU and NPU simplifies running models on local hardware.