Rio de Janeiro's 'Homegrown' LLM Appears to Be a Merge of an Existing Model — AI article on gikiewicz.com

IplanRIO, Rio de Janeiro’s municipal IT company, released a 397-billion-parameter language model called Rio 3.5 Open. The model is built from the Qwen 3.5 397B base and is available on Hugging Face. According to Digg, it edges out several well-known competitors on benchmark scores. But the AI community quickly noticed something odd about the model’s origins.

TL;DR: IplanRIO, Rio de Janeiro’s municipal IT company, released Rio 3.5 Open, a 397B-parameter model built from the Qwen 3.5 397B base. The model is available on Hugging Face and, according to Digg, edges out competitors on benchmarks. However, multiple researchers found evidence suggesting the model is a merge of existing weights rather than a fully independent training effort.

What Is the Rio 3.5 Open LLM?

Rio 3.5 Open is a large language model developed by IplanRIO, the municipal IT company serving Rio de Janeiro. The model features 397 billion parameters and was built using the Qwen 3.5 397B architecture as its foundation. IplanRIO published the weights on Hugging Face under an open-weights license, making the model freely downloadable for research and commercial use (Digg).

The release was positioned as a locally developed AI resource. Municipal governments rarely produce models at this scale, since a 397B-parameter model requires substantial GPU infrastructure and training expertise. The choice of Qwen 3.5 397B as a base is itself notable: Alibaba’s Qwen series has gained traction as a strong open-weights foundation, and building on top of it avoids the cost of pretraining from scratch.
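To give a rough sense of the infrastructure involved, the sketch below estimates the memory needed just to hold the weights. The 2 bytes per parameter (16-bit precision) and the 80 GB accelerator size are illustrative assumptions, not details from IplanRIO's release. Activations, caches, and optimizer state would add to this figure.

```python
import math

PARAMS = 397e9        # parameter count stated for Rio 3.5 Open
BYTES_PER_PARAM = 2   # assumption: 16-bit weights
GPU_MEMORY_GB = 80    # assumption: one 80 GB accelerator


def weight_memory_gb(params: float, bytes_per_param: int) -> float:
    """Memory needed to hold the weights alone, in gigabytes (10^9 bytes)."""
    return params * bytes_per_param / 1e9


def min_gpus_for_weights(total_gb: float, gpu_gb: float) -> int:
    """Lower bound on accelerators needed just to store the weights."""
    return math.ceil(total_gb / gpu_gb)


weights_gb = weight_memory_gb(PARAMS, BYTES_PER_PARAM)
print(f"{weights_gb:.0f} GB of weights")  # 794 GB of weights
print(min_gpus_for_weights(weights_gb, GPU_MEMORY_GB), "GPUs at minimum")  # 10 GPUs at minimum
```

Under these assumptions, storing the weights alone takes roughly ten such accelerators before any training or serving overhead is counted.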

IplanRIO described the project as a post-training effort. This means the organization took the existing Qwen base model and applied additional training steps — fine-tuning, alignment, or instruction tuning — to adapt it for Portuguese-language tasks and municipal use cases. The model’s Hugging Face page highlights benchmark results where it outperforms several similarly sized competitors.

But the AI research community began asking questions almost immediately.

How Does the Model Actually Work?

The model operates as a post-trained variant of Qwen 3.5 397B, meaning IplanRIO started with pretrained weights and applied additional optimization rather than training from scratch. Post-training typically involves supervised fine-tuning on curated datasets, followed by reinforcement learning from human feedback or similar alignment techniques. The goal is to specialize the model for specific tasks or languages without repeating the expensive pretraining phase.

For a municipal IT company, this approach makes practical sense. Training a 397B-parameter model from scratch would cost millions of dollars in compute alone. Post-training on an existing base reduces that cost dramatically while still allowing meaningful customization. The model can be adapted for Portuguese-language government services, citizen interaction systems, or internal administrative tools.

The technical details matter here. When an organization post-trains a model, the resulting weights should show measurable differences from the base model in the fine-tuned layers. Token embeddings, attention patterns, and output distributions should shift to reflect the new training data. If the differences are minimal or absent, the post-training claim becomes questionable.
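One way to make "measurable differences" concrete is to compute, for each tensor, how far the post-trained weights moved relative to the base. The following is a minimal sketch that uses plain Python lists as stand-ins for flattened tensors. The layer names and the threshold are hypothetical, and a real check would load both checkpoints with a tensor library.

```python
import math
from typing import Dict, List

Tensor = List[float]  # stand-in for a flattened weight tensor


def relative_change(base: Tensor, tuned: Tensor) -> float:
    """Return ||tuned - base|| / ||base|| (L2 norms) for one tensor."""
    if len(base) != len(tuned):
        raise ValueError("tensors must have the same number of elements")
    diff_norm = math.sqrt(sum((t - b) ** 2 for b, t in zip(base, tuned)))
    base_norm = math.sqrt(sum(b * b for b in base))
    if base_norm == 0:
        return 0.0 if diff_norm == 0 else math.inf
    return diff_norm / base_norm


def changed_tensors(base_ckpt: Dict[str, Tensor],
                    tuned_ckpt: Dict[str, Tensor],
                    threshold: float = 1e-6) -> Dict[str, float]:
    """Map each shared tensor name to its relative change if above threshold."""
    report = {}
    for name in base_ckpt.keys() & tuned_ckpt.keys():
        change = relative_change(base_ckpt[name], tuned_ckpt[name])
        if change > threshold:
            report[name] = change
    return report


# Toy checkpoints with hypothetical layer names.
base = {"embed": [1.0, 2.0, 2.0], "layer0.attn": [3.0, 4.0]}
tuned = {"embed": [1.0, 2.0, 2.0], "layer0.attn": [3.0, 4.5]}
print(changed_tensors(base, tuned))
```

In this toy case only `layer0.attn` is reported, with a relative change of about 0.1. An empty report across a whole checkpoint would suggest that no post-training took place.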

This is exactly what researchers examined.

What Evidence Suggests It Is a Merge?

Several members of the AI community analyzed Rio 3.5 Open’s weights and found indicators consistent with a model merge rather than independent post-training. A model merge combines weights from multiple existing models using mathematical techniques like spherical linear interpolation or task vector arithmetic. The resulting model inherits properties from its parent models without requiring new training data.
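The two techniques named above can be sketched in a few lines. These are the generic textbook forms applied to small vectors, not the procedure used to produce Rio 3.5 Open. Function names are illustrative, and real merge tools apply the same arithmetic tensor by tensor.

```python
import math
from typing import List

Vector = List[float]


def slerp(a: Vector, b: Vector, t: float) -> Vector:
    """Spherical linear interpolation between two weight vectors."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("slerp is undefined for zero vectors")
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))  # guard against rounding
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < 1e-8:  # (anti)parallel vectors: fall back to linear interpolation
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    wa = math.sin((1 - t) * omega) / sin_omega
    wb = math.sin(t * omega) / sin_omega
    return [wa * x + wb * y for x, y in zip(a, b)]


def task_vector_merge(base: Vector, finetuned: List[Vector],
                      scales: List[float]) -> Vector:
    """Task arithmetic: base + sum_i scale_i * (finetuned_i - base)."""
    merged = list(base)
    for model, scale in zip(finetuned, scales):
        for i, (w, b) in enumerate(zip(model, base)):
            merged[i] += scale * (w - b)
    return merged


print(slerp([1.0, 0.0], [0.0, 1.0], 0.5))
print(task_vector_merge([1.0, 1.0], [[2.0, 1.0], [1.0, 3.0]], [0.5, 0.5]))  # [1.5, 2.0]
```

Neither function needs training data or gradient steps, which is why a merge can be produced far faster than a post-training run.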

The evidence includes weight distribution patterns that closely match known merge signatures. When researchers compared Rio 3.5 Open’s internal representations to existing models on Hugging Face, they found overlaps that would be statistically improbable from independent fine-tuning alone. The model’s behavior on specific prompts also mirrored characteristics of other publicly available models.
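One plausible form such a comparison could take, offered here as an assumption rather than the method the researchers used, is to subtract the base weights from each model and measure the cosine similarity of the resulting deltas. Deltas from independent fine-tuning runs would typically point in largely unrelated directions, while a merged model's delta lines up with those of its parents. The vectors below are toy values.

```python
import math
from typing import List

Vector = List[float]


def delta(model: Vector, base: Vector) -> Vector:
    """Difference between a model's weights and the shared base weights."""
    return [m - b for m, b in zip(model, base)]


def cosine_similarity(u: Vector, v: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


base = [0.0, 0.0, 0.0, 0.0]
known = [1.0, 0.0, 0.0, 0.0]      # hypothetical public fine-tune of the base
candidate = [0.5, 0.0, 0.0, 0.0]  # equals base + 0.5 * (known - base)
unrelated = [0.0, 1.0, 0.0, 0.0]  # hypothetical independent fine-tune

print(cosine_similarity(delta(candidate, base), delta(known, base)))  # 1.0
print(cosine_similarity(delta(unrelated, base), delta(known, base)))  # 0.0
```

A similarity near 1.0 across many tensors would be hard to explain by coincidence, which is the kind of overlap described above.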

Additionally, the speed of release raised questions. IplanRIO announced and published the model within a timeframe that some researchers considered too short for a thorough post-training cycle on a 397B-parameter architecture. Quality post-training at this scale typically requires weeks of iteration, evaluation, and refinement.

Model merging itself is not inherently deceptive. Many legitimate open-source projects use merging techniques to combine strengths of different models. The concern arises when a merged model is presented as an independent development effort.

How Does It Perform on Benchmarks?

According to the Digg report, Rio 3.5 Open edges out several competitors on benchmarks where it has been evaluated. The model’s Hugging Face model card lists performance metrics that place it competitively among open-weights models in the 300B-plus parameter range. These results are consistent with what merged models can achieve — well-executed merges often match or exceed their parent models on standard benchmarks.

Benchmark performance alone does not validate or invalidate the development methodology. A merged model can legitimately outperform independently trained models if the merge combines complementary strengths effectively. The AI community has documented numerous cases where merged models achieve state-of-the-art results on leaderboards.

However, benchmark transparency becomes critical when evaluating claims about a model’s origins. If Rio 3.5 Open’s strong performance stems primarily from inheriting capabilities of existing models through merging, that context should accompany the results. Independent post-training would represent a different contribution to the open-source ecosystem than a skillfully executed merge.

The distinction matters for reproducibility. Other researchers seeking to build on IplanRIO’s work need accurate information about methodology to understand what they are working with.

Why Did IplanRIO Build This Model?

IplanRIO, Rio de Janeiro’s municipal IT company, developed Rio 3.5 Open as part of a broader initiative to deploy AI tools tailored to Portuguese-language public administration tasks. The model was built from the Qwen 3.5 397B base, a model originally developed by Alibaba’s Qwen team. IplanRIO applied post-training techniques to adapt the model for local government use cases.

The motivation appears to be twofold. First, Brazilian government agencies need language models that perform well in Portuguese, particularly Brazilian Portuguese with its regional vocabulary and administrative terminology. Second, relying on foreign API-based services creates data sovereignty concerns for government workloads. A locally hosted model gives the municipality full control over inference and data handling.

Municipal IT departments across Brazil have been exploring similar strategies. The Brazilian federal government launched its own AI plan in 2024, emphasizing sovereign AI capabilities. Rio’s effort fits within that national push for technological autonomy.

However, calling the model “homegrown” has drawn criticism. The post-training layer added by IplanRIO sits on top of an existing foundation model from a Chinese tech company. This raises questions about what constitutes a genuinely local AI product.

What Are the Licensing Implications?

The Qwen 3.5 397B base model is released under the Apache 2.0 license, which permits commercial use, modification, and redistribution with minimal restrictions. This means IplanRIO’s post-trained derivative is legally permissible under the terms of the original license. Apache 2.0 requires attribution and inclusion of the license notice, but does not impose copyleft requirements.

Rio 3.5 Open was published on Hugging Face, making the weights publicly available for download and further modification. The Apache 2.0 license allows this type of redistribution without requiring the derivative work to adopt the same license. This is a significant difference from GPL-style licenses that mandate source availability for all derivatives.

However, the licensing situation becomes more nuanced when considering attribution norms in the open weights community. Beyond the license notice that Apache 2.0 requires, prominently crediting the base model creator is standard practice even though the license does not demand it. The controversy around Rio 3.5 Open stems partly from how the model was presented publicly.

If a government entity markets a model as “homegrown” or “nationally developed” without clearly disclosing its relationship to the Qwen base, this creates tension with open source community expectations. Legal compliance and ethical attribution are separate concerns.

The table below summarizes key licensing details:

| Aspect | Details |
| --- | --- |
| Base Model | Qwen 3.5 397B |
| Base License | Apache 2.0 |
| Commercial Use | Permitted |
| Attribution Required | Yes (license notice) |
| Copyleft Obligations | None |
| Derivative Distribution | Allowed |

Open weights models like Qwen have enabled a wide ecosystem of derivatives. The licensing framework supports this, but transparency about provenance remains a community norm that carries weight beyond legal requirements.

How Did the Community React to the Release?

The AI community’s reaction to Rio 3.5 Open was swift and critical. Researchers and developers on social media quickly identified the model’s connection to Qwen 3.5 397B, noting that the post-training modifications did not constitute an independently developed foundation model. The term “merge” appeared frequently in discussions, referring to the practice of combining existing model weights rather than training from scratch.

Several developers pointed out that the model’s performance characteristics closely matched those of the Qwen base. This is expected for a post-trained derivative, but the marketing language used by IplanRIO emphasized Brazilian innovation and local development. The gap between marketing claims and technical reality fueled the backlash.

The open weights community has established norms around model provenance. Platforms like Hugging Face encourage clear documentation of base models, training data, and modification techniques. When these details are incomplete or buried, community members typically flag the omission.

Critics also raised concerns about resource allocation. If public funds were spent developing a model that is essentially a fine-tuned version of an existing open weights release, questions arise about value for money. Training a genuinely new foundation model requires enormous compute resources, but post-training is significantly cheaper.

Supporters of the project argued that post-training a model for specific local needs is a legitimate and practical approach. Many government AI initiatives worldwide use existing foundation models as starting points. The debate ultimately centers on transparency rather than technical methodology.

What Does This Mean for Open Weights AI?

The Rio 3.5 Open case highlights a growing tension in the open weights ecosystem. As more organizations release post-trained derivatives of existing models, the line between original development and repackaging becomes blurred. The open weights community depends on clear provenance documentation to maintain trust.

This situation is not unique to Rio. The Hugging Face platform hosts thousands of models that are fine-tuned, merged, or quantized versions of a small number of base models. Qwen, Llama, and Mistral derivatives make up a significant portion of available models. Most of these clearly document their relationship to the base model.

The controversy serves as a reminder that open weights licenses enable derivative works but do not automatically ensure ethical attribution practices. Organizations releasing models have a responsibility to be transparent about their development process, especially when public funding is involved.

For the broader AI ecosystem, this case underscores the importance of model cards and documentation. A well-written model card should include the base model, training methodology, datasets used, and intended applications. When any of these elements are missing, the community fills the gap with speculation.
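A simple automated check can catch missing provenance before a release. The sketch below assumes the card's metadata has already been parsed into a dictionary. The field names follow common Hugging Face model card metadata keys, while the sample values are placeholders rather than real repositories. Training methodology and intended use usually live in the card's prose and are not covered here.

```python
from typing import Dict, List

# Metadata keys treated as required for provenance in this sketch.
REQUIRED_FIELDS = ("base_model", "license", "datasets", "language")


def missing_provenance_fields(metadata: Dict[str, object]) -> List[str]:
    """Return required fields that are absent or empty in the card metadata."""
    return [field for field in REQUIRED_FIELDS if not metadata.get(field)]


# Hypothetical metadata; identifiers are placeholders.
card = {
    "base_model": "example-org/example-base-model",
    "license": "apache-2.0",
    "language": ["pt"],
}
print(missing_provenance_fields(card))  # ['datasets']
```

An empty list means every tracked field is present. It does not confirm that the stated values are accurate.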

Open weights AI has thrived because of shared norms around credit and transparency. Cases like Rio 3.5 Open test those norms and push the community to articulate standards more explicitly.

Can Other Cities Replicate This Strategy?

Technically, yes. Post-training an existing open weights model requires far fewer resources than training a foundation model from scratch. A municipality with access to enough GPU capacity to host the base model can fine-tune a model like Qwen 3.5 for local language and administrative tasks. The Apache 2.0 license explicitly permits this type of use.

However, replication comes with caveats. Organizations should be transparent about their development process from the start. Clearly documenting the base model, training data, and modification techniques prevents the kind of backlash that Rio 3.5 Open received. Honesty about what was built matters.

Several cities and national governments have launched similar initiatives:

  • Abu Dhabi developed the Falcon model series through the Technology Innovation Institute
  • Singapore government agencies have deployed fine-tuned models for public services
  • France supported the development of Mistral models through public investment
  • South Korea funded the Naver HyperCLOVA X model for Korean-language tasks
  • India announced plans for sovereign AI models through the IndiaAI mission
  • Indonesia explored local language models through government-academic partnerships
  • Saudi Arabia invested in Arabic-language models via the SDAIA initiative
  • Spain's Catalonia region deployed open source models for Catalan language support

The key difference between successful initiatives and controversial ones often comes down to communication. Projects that clearly state their relationship to existing foundation models tend to receive community support. Those that overstate originality face criticism regardless of their technical merits.

For cities considering a similar path, the lesson is straightforward. Use open weights models as building blocks. Document everything. Credit the original creators. Be honest about what was added versus what was inherited.

Frequently Asked Questions

Is Rio 3.5 Open a completely new model?

No. Rio 3.5 Open is a post-trained derivative of Qwen 3.5 397B, not an independently trained foundation model. IplanRIO applied post-training techniques to adapt the existing Qwen base for Brazilian Portuguese and municipal government applications.

What base model was used for Rio 3.5 Open?

The base model is Qwen 3.5 397B, developed by Alibaba’s Qwen team. IplanRIO built Rio 3.5 Open on top of this foundation using post-training methods. The Qwen 3.5 397B model is available under the Apache 2.0 license.

How many parameters does the Rio model have?

Rio 3.5 Open has 397 billion parameters, inherited directly from the Qwen 3.5 397B base model. Post-training typically does not change the parameter count of the underlying architecture. The model was published on Hugging Face with publicly downloadable weights.

Where can developers access the model weights?

The model weights are available on Hugging Face, where IplanRIO published Rio 3.5 Open for public download. Developers can access the weights, inspect the model card, and download the files for local inference or further modification under the terms of the Apache 2.0 license.

Summary

  • Rio 3.5 Open is a post-trained derivative of Qwen 3.5 397B, not an independently developed foundation model
  • The Apache 2.0 license legally permits this type of derivative work, including commercial use and redistribution
  • Community backlash focused on transparency and attribution rather than the technical approach itself
  • Clear documentation of base models, training data, and methodology is essential for maintaining trust in the open weights ecosystem
  • Other cities can replicate this strategy, but transparency about provenance is critical from the outset
