Andre Bandarra's Blog

Heterogeneous collections in Rust

Occasionally, developers need a heterogeneous collection: a collection that can store objects of different types. Rust offers several ways to build one, each with different tradeoffs. This article looks at a few of them.

Using Enums

Rust enums are a great fit for this. Provided that all the types to be stored are known at development time, developers can create an enum with a variant that wraps each possible type, then create a collection of that enum.

Then, to access the methods and fields of the inner type, a match or if let expression can be used to retrieve the inner object.

enum ComponentType {
    FirstComponent(MyFirstComponent),
    SecondComponent(MySecondComponent),
}

struct MyFirstComponent {}

impl MyFirstComponent {
    fn do_first_component_thing(&self) {
        println!("First Component");
    }
}

struct MySecondComponent {}

impl MySecondComponent {
    fn do_second_component_thing(&self) {
        println!("Second Component");
    }
}

fn main() {
    // Create a collection of enums.
    let mut components: Vec<ComponentType> = Vec::new();

    // Add the enums to the collection, wrapping the target type.
    components.push(ComponentType::FirstComponent(MyFirstComponent {}));
    components.push(ComponentType::SecondComponent(MySecondComponent {}));

    // Use an if let expression to retrieve the object from the enum
    // and access its methods and fields.
    if let ComponentType::FirstComponent(component) = &components[0] {
        component.do_first_component_thing();
    }
}

An advantage of this method is that the implementation is simple and idiomatic. The enum values are also stored inline: in a fixed-size array, for instance, all objects live on the stack (the Vec used in this example stores its elements on the heap, though).

On the other hand, the component types need to be known when writing the code. Think of a library that needs to store objects of different types, where those types are only known to the user of that library.
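As a minimal sketch of the two points above, the example below stores the enums in a fixed-size array and uses a full match expression instead of if let. The describe methods and the describe_component function are hypothetical names introduced for illustration; they are not part of the example above.

```rust
// Hypothetical component types, mirroring the ones used in this article.
struct MyFirstComponent {}
struct MySecondComponent {}

impl MyFirstComponent {
    fn describe(&self) -> &'static str {
        "First Component"
    }
}

impl MySecondComponent {
    fn describe(&self) -> &'static str {
        "Second Component"
    }
}

enum ComponentType {
    FirstComponent(MyFirstComponent),
    SecondComponent(MySecondComponent),
}

// An exhaustive match: if a new variant is added to ComponentType and
// not handled here, the compiler rejects this function.
fn describe_component(component: &ComponentType) -> &'static str {
    match component {
        ComponentType::FirstComponent(c) => c.describe(),
        ComponentType::SecondComponent(c) => c.describe(),
    }
}

fn main() {
    // A fixed-size array of enums. No Box or Vec is involved, so the
    // values are stored inline in the array.
    let components = [
        ComponentType::FirstComponent(MyFirstComponent {}),
        ComponentType::SecondComponent(MySecondComponent {}),
    ];

    for component in &components {
        println!("{}", describe_component(component));
    }
}
```

The exhaustiveness check is a practical benefit of match over if let here: forgetting to handle a variant becomes a compile error rather than a silent no-op.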

Using Traits

An alternative is to use traits: declare a trait with the common behaviour, implement it for each type, and store the objects as boxed trait objects:


// Declare a trait with common behaviour.
trait Component {
    fn do_component_thing(&self);
}

struct MyFirstComponent {}

// Implement the trait for each type.
impl Component for MyFirstComponent {
    fn do_component_thing(&self) {
        println!("First Component");
    }
}

struct MySecondComponent {}

impl Component for MySecondComponent {
    fn do_component_thing(&self) {
        println!("Second Component");
    }
}

fn main() {
    let mut components: Vec<Box<dyn Component>> = Vec::new();
    components.push(Box::new(MyFirstComponent {}));
    components.push(Box::new(MySecondComponent {}));

    components[0].do_component_thing();
    components[1].do_component_thing();
}

This approach works well when only the methods common to all types, as declared on the trait, are needed. One disadvantage is that each element of a Vec<Box<dyn Component>> is a separate heap allocation. Another is that only the trait's methods are accessible, not the methods or fields specific to each type.
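The heap allocation comes from Box rather than from the trait object itself. As a hedged sketch, when the collection only needs to borrow the components, references to trait objects can be stored instead. The describe method and the describe_all function are hypothetical names used for illustration.

```rust
trait Component {
    fn describe(&self) -> String;
}

struct MyFirstComponent {}
struct MySecondComponent {}

impl Component for MyFirstComponent {
    fn describe(&self) -> String {
        "First Component".to_string()
    }
}

impl Component for MySecondComponent {
    fn describe(&self) -> String {
        "Second Component".to_string()
    }
}

// Calls the common method on every borrowed trait object.
fn describe_all(components: &[&dyn Component]) -> Vec<String> {
    components.iter().map(|c| c.describe()).collect()
}

fn main() {
    // The components themselves are plain local values.
    let first = MyFirstComponent {};
    let second = MySecondComponent {};

    // The array borrows the values instead of owning them, so no Box
    // is needed for the components.
    let components: [&dyn Component; 2] = [&first, &second];

    for line in describe_all(&components) {
        println!("{}", line);
    }
}
```

The tradeoff is ownership: the collection cannot outlive the values it borrows, which makes this less suitable when the collection has to own its elements.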

Using Any

The Rust documentation describes Any as "a trait to emulate dynamic typing". It provides downcast methods, such as downcast_ref, which attempt to convert a trait object back to a concrete type.

use std::any::Any;

struct MyFirstComponent {}

impl MyFirstComponent {
    fn do_first_component_thing(&self) {
        println!("First Component");
    }
}

struct MySecondComponent {}

impl MySecondComponent {
    fn do_second_component_thing(&self) {
        println!("Second Component");
    }
}

fn main() {
    let mut components: Vec<Box<dyn Any>> = Vec::new();
    components.push(Box::new(MyFirstComponent {}));
    components.push(Box::new(MySecondComponent {}));

    if let Some(component) = components[0].downcast_ref::<MyFirstComponent>() {
        component.do_first_component_thing();
    }

    if let Some(component) = components[1].downcast_ref::<MySecondComponent>() {
        component.do_second_component_thing();
    }
}

While this still allocates each object on the heap, it's now possible to store different component types in the collection, downcast them to their original types, and access component-specific fields and methods.

There's one small issue, though: there is no bound on which types can be added to the collection, and the line below works just fine:
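The example above only calls methods. As a small sketch of reaching component-specific fields as well, downcast_mut can be used to modify a value in place. The Counter and Label types and the bump_counters function are hypothetical and only serve the illustration.

```rust
use std::any::Any;

// Hypothetical components with fields of their own.
struct Counter {
    count: u32,
}

struct Label {
    text: String,
}

// Increments every Counter in the collection and returns how many
// counters were found. Elements of other types are skipped.
fn bump_counters(components: &mut [Box<dyn Any>]) -> usize {
    let mut found = 0;
    for component in components.iter_mut() {
        // downcast_mut returns Some only when the stored value is a Counter.
        if let Some(counter) = component.downcast_mut::<Counter>() {
            counter.count += 1;
            found += 1;
        }
    }
    found
}

fn main() {
    let mut components: Vec<Box<dyn Any>> = Vec::new();
    components.push(Box::new(Counter { count: 0 }));
    components.push(Box::new(Label { text: "hello".to_string() }));

    let found = bump_counters(&mut components);
    println!("counters found: {}", found);

    // Read the fields back through downcast_ref.
    if let Some(counter) = components[0].downcast_ref::<Counter>() {
        println!("count: {}", counter.count);
    }
    if let Some(label) = components[1].downcast_ref::<Label>() {
        println!("label: {}", label.text);
    }
}
```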

components.push(Box::new("I shouldn't be here".to_string()));
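To make the consequence concrete, here is a minimal sketch in which a String ends up in the collection and is only noticed when a downcast fails. The count_first_components function is a hypothetical helper, not part of the original example.

```rust
use std::any::Any;

struct MyFirstComponent {}

// Counts the elements whose concrete type really is MyFirstComponent.
fn count_first_components(components: &[Box<dyn Any>]) -> usize {
    components
        .iter()
        .filter(|c| c.is::<MyFirstComponent>())
        .count()
}

fn main() {
    let mut components: Vec<Box<dyn Any>> = Vec::new();
    components.push(Box::new(MyFirstComponent {}));
    // Accepted by the compiler: String is a 'static type, so it
    // implements Any like every other 'static type.
    components.push(Box::new("I shouldn't be here".to_string()));

    println!(
        "{} of {} elements are components",
        count_first_components(&components),
        components.len()
    );

    // The stray element is detected only at runtime, when the downcast
    // to the expected type returns None.
    if components[1].downcast_ref::<MyFirstComponent>().is_none() {
        println!("element 1 is not a MyFirstComponent");
    }
}
```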

Mixing Any and Traits

Any can be used along with traits to bound what the collection accepts. The trick is to add a method to the trait that converts the object to &dyn Any, which can then be downcast to the concrete type. Each struct then has to implement the trait, including the conversion method:

use std::any::Any;

trait Component {
    fn as_any(&self) -> &dyn Any;
}

struct MyFirstComponent {}

impl MyFirstComponent {
    fn do_first_component_thing(&self) {
        println!("First Component");
    }
}

impl Component for MyFirstComponent {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

struct MySecondComponent {}

impl MySecondComponent {
    fn do_second_component_thing(&self) {
        println!("Second Component");
    }
}

impl Component for MySecondComponent {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

fn main() {
    let mut components: Vec<Box<dyn Component>> = Vec::new();
    components.push(Box::new(MyFirstComponent {}));
    components.push(Box::new(MySecondComponent {}));

    if let Some(component) = components[0].as_any().downcast_ref::<MyFirstComponent>() {
        component.do_first_component_thing();
    }

    if let Some(component) = components[1].as_any().downcast_ref::<MySecondComponent>() {
        component.do_second_component_thing();
    }
}

Again, this still allocates objects on the heap, but object-specific methods and fields can be reached with a downcast, and the collection only accepts objects that implement the trait. One big downside is having to implement the trait for each type, which is pure boilerplate.
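Since the collection is now bound to a trait, that trait can also carry common behaviour next to as_any. The sketch below assumes a hypothetical name method and a hypothetical find_first helper; neither appears in the example above.

```rust
use std::any::Any;

trait Component {
    // Behaviour shared by every component (hypothetical).
    fn name(&self) -> &'static str;
    // Conversion hook used for downcasting.
    fn as_any(&self) -> &dyn Any;
}

struct MyFirstComponent {}

struct MySecondComponent {
    value: i32,
}

impl Component for MyFirstComponent {
    fn name(&self) -> &'static str {
        "first"
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl Component for MySecondComponent {
    fn name(&self) -> &'static str {
        "second"
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// Hypothetical helper: returns the first element whose concrete type is T.
fn find_first<T: Component + 'static>(components: &[Box<dyn Component>]) -> Option<&T> {
    components
        .iter()
        .find_map(|c| c.as_any().downcast_ref::<T>())
}

fn main() {
    let mut components: Vec<Box<dyn Component>> = Vec::new();
    components.push(Box::new(MyFirstComponent {}));
    components.push(Box::new(MySecondComponent { value: 7 }));

    // Shared behaviour needs no downcast.
    for component in &components {
        println!("{}", component.name());
    }

    // Type-specific data is reached through the downcasting helper.
    if let Some(second) = find_first::<MySecondComponent>(&components) {
        println!("value = {}", second.value);
    }
}
```

The T: Component bound on the helper means a lookup for a type that could never be in the collection fails at compile time instead of returning None.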

Using proc-macro-derive to avoid boilerplate

One solution is to use a procedural derive macro to generate the boilerplate:


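// A derive macro needs to live in its own crate, with proc-macro = true
// under [lib] in that crate's Cargo.toml and syn and quote as dependencies.
use proc_macro::TokenStream;
use quote::quote;
use syn::DeriveInput;

#[proc_macro_derive(Component)]
pub fn component_macro_derive(input: TokenStream) -> TokenStream {
    // Parse the annotated item to get the name of the type.
    let ast: DeriveInput = syn::parse(input).unwrap();
    let name = &ast.ident;
    // Generate the as_any boilerplate. The Component trait must be in
    // scope wherever the derive is used.
    let expanded = quote! {
        impl Component for #name {
            fn as_any(&self) -> &dyn ::std::any::Any {
                self
            }
        }
    };
    expanded.into()
}

// The component still lives in the project file, which imports the
// derive macro from the macro crate.
#[derive(Component)]
struct MyFirstComponent {}

impl MyFirstComponent {
    fn do_first_component_thing(&self) {
        println!("First Component");
    }
}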
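When a separate crate is more than the project needs, a declarative macro may be enough. This is a different technique from the derive macro above, sketched here only as a lighter alternative; the impl_component macro name is hypothetical.

```rust
use std::any::Any;

trait Component {
    fn as_any(&self) -> &dyn Any;
}

// Hypothetical declarative macro: implements Component for each
// listed type, generating the same as_any boilerplate.
macro_rules! impl_component {
    ($($ty:ty),+ $(,)?) => {
        $(
            impl Component for $ty {
                fn as_any(&self) -> &dyn Any {
                    self
                }
            }
        )+
    };
}

struct MyFirstComponent {}
struct MySecondComponent {}

// One line replaces the two hand-written impl blocks.
impl_component!(MyFirstComponent, MySecondComponent);

impl MyFirstComponent {
    fn do_first_component_thing(&self) {
        println!("First Component");
    }
}

fn main() {
    let mut components: Vec<Box<dyn Component>> = Vec::new();
    components.push(Box::new(MyFirstComponent {}));
    components.push(Box::new(MySecondComponent {}));

    if let Some(component) = components[0].as_any().downcast_ref::<MyFirstComponent>() {
        component.do_first_component_thing();
    }
}
```

Unlike the derive macro, the types have to be listed explicitly in the macro call, but everything stays in a single crate with no extra dependencies.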

Conclusion

There are different ways to implement heterogeneous collections in Rust. While the enum approach seems to be considered the most idiomatic, it can't always be used. In those cases, other approaches are available, each with its own tradeoffs.
