Episode 5: Scraping the Whole Dofus Encyclopedia

Julien Truffaut

16 October 2025

In the previous episode, we fetched and stored the first 10 amulets from the DofusDB API. Now it’s time to go all in and scrape the entire encyclopedia of Dofus gear.

Pagination: When Things Don’t Quite Work

As a reminder, here’s the function we’ve been using to fetch amulets:

async fn fetch_amulets(skip: u32) -> reqwest::Result<GetObjectsResponse> {
    let url = format!(
        "https://api.dofusdb.fr/items?typeId[$in][]=1&$sort=-id&$skip={}",
        skip
    );
    let resp = reqwest::get(url).await?;
    let data: GetObjectsResponse = resp.json().await?;
    Ok(data)
}

The skip parameter handles pagination. To fetch all amulets, we just need to keep calling this function, increasing skip after each call by the number of items in the page we just received. Here’s my first attempt:

async fn fetch_all_amulets() -> reqwest::Result<Vec<serde_json::Value>> {
    let mut gears: Vec<serde_json::Value> = vec![];
    let mut skip = 0;

    loop {
        let mut response = fetch_amulets(skip).await?;
        if response.data.is_empty() {
            break;
        } else {
            gears.append(&mut response.data);
            skip += response.data.len() as u32;
        }
    }

    Ok(gears)
}

Looks simple enough… but when I ran it, it never returned. Can you spot the bug?

gears.append(&mut response.data);
skip += response.data.len() as u32;

append moves all elements from response.data into gears, leaving response.data empty! So when I increment skip with response.data.len(), I’m actually adding 0 on every iteration. I’ve spent too much time in immutable-land — order of statements matters again 😅
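To see the effect in isolation, here is a minimal sketch that measures a vector's length before and after it is appended to another one. The helper name append_and_measure and the u32 sample data are made up for illustration; only Vec::append itself comes from the code above.

```rust
/// Moves `page` into `gears`, returning the page length measured
/// before and after the call to `append`.
fn append_and_measure(gears: &mut Vec<u32>, page: &mut Vec<u32>) -> (usize, usize) {
    let before = page.len();
    // `append` moves every element out of `page`, leaving it empty.
    gears.append(page);
    let after = page.len();
    (before, after)
}

fn main() {
    let mut gears = vec![1, 2];
    let mut page = vec![3, 4, 5];
    let (before, after) = append_and_measure(&mut gears, &mut page);
    println!("page length before append: {before}, after: {after}");
    println!("gears: {gears:?}");
}
```

The second length is always 0, which is exactly why the skip counter in the first attempt never moved.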

Fixing the Bug

There are several ways to fix this:

  1. Update skip before appending.
  2. Use extend_from_slice instead of append, which leaves response.data intact (but clones every element unnecessarily).
  3. Drop skip entirely and use gears.len() instead.

Option 3 is the cleanest: Vec::len() is O(1) — it just reads a field from memory. So we can simply use the length of gears as our pagination offset:

async fn fetch_all_amulets() -> reqwest::Result<Vec<serde_json::Value>> {
    let mut gears: Vec<serde_json::Value> = vec![];

    loop {
        let mut response = fetch_amulets(gears.len() as u32).await?;
        if response.data.is_empty() {
            break;
        } else {
            gears.append(&mut response.data);
        }
    }

    Ok(gears)
}

This works perfectly. (And if you know an even more idiomatic Rust way to do this, I’d love to hear it 👀)
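One possible generalization, offered as a sketch and not as the project's actual code: the "offset equals items collected so far" loop can be pulled into a helper that takes the page-fetching step as a closure. To keep the example runnable without a network or an async runtime, it is synchronous and uses a fake in-memory API; collect_all_pages and fake_api are hypothetical names.

```rust
/// Repeatedly calls `fetch_page(offset)` until it returns an empty page,
/// using the number of items collected so far as the next offset.
fn collect_all_pages<T, E>(
    mut fetch_page: impl FnMut(usize) -> Result<Vec<T>, E>,
) -> Result<Vec<T>, E> {
    let mut items = Vec::new();
    loop {
        let mut page = fetch_page(items.len())?;
        if page.is_empty() {
            return Ok(items);
        }
        items.append(&mut page);
    }
}

/// Stand-in for the remote API (an assumption for this sketch):
/// serves `total` items in pages of at most `page_size`.
fn fake_api(total: usize, page_size: usize, skip: usize) -> Result<Vec<usize>, String> {
    let end = (skip + page_size).min(total);
    Ok((skip.min(total)..end).collect())
}

fn main() {
    let ids = collect_all_pages(|skip| fake_api(25, 10, skip)).unwrap();
    println!("collected {} items", ids.len());
}
```

Because the loop logic no longer knows about HTTP, it can be exercised with a fake like the one above, and an error from any page stops the loop through the ? operator.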

Supporting All Gear Types

Fetching all amulets is nice, but we want everything — axes, boots, rings, and so on.

Extending the Enum

#[derive(Debug)] // needed to print a gear type with {:?}
enum GearType {
    Amulet,
    Axe,
    Belt,
    Boots,
    Bow,
    // ...
}

And a static list of all gear types:

static ALL_GEAR_TYPES: &[GearType] = &[
    GearType::Amulet,
    GearType::Axe,
    GearType::Belt,
    GearType::Boots,
    GearType::Bow,
    // ...
];

Mapping GearType to DofusDB typeId

Each gear type has a corresponding typeId in the DofusDB API:

fn gear_type_to_type_id(gear_type: &GearType) -> i32 {
    match gear_type {
        GearType::Amulet => 1,
        GearType::Axe    => 19,
        GearType::Belt   => 30,
        GearType::Boots  => 11,
        GearType::Bow    => 2,
        // ...
    }
}
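A hand-written mapping like this is easy to get wrong with a copy-paste slip, so a small sanity check can help. The sketch below only covers the five variants shown above (the elided ones are left out), and type_ids_are_unique is a hypothetical helper, not part of the original project.

```rust
use std::collections::HashSet;

#[derive(Debug)]
enum GearType {
    Amulet,
    Axe,
    Belt,
    Boots,
    Bow,
}

static ALL_GEAR_TYPES: &[GearType] = &[
    GearType::Amulet,
    GearType::Axe,
    GearType::Belt,
    GearType::Boots,
    GearType::Bow,
];

fn gear_type_to_type_id(gear_type: &GearType) -> i32 {
    match gear_type {
        GearType::Amulet => 1,
        GearType::Axe => 19,
        GearType::Belt => 30,
        GearType::Boots => 11,
        GearType::Bow => 2,
    }
}

/// Returns true when every listed gear type maps to a different typeId.
fn type_ids_are_unique(gear_types: &[GearType]) -> bool {
    let mut seen = HashSet::new();
    // `insert` returns false as soon as an id has been seen before.
    gear_types.iter().all(|g| seen.insert(gear_type_to_type_id(g)))
}

fn main() {
    println!("typeIds unique: {}", type_ids_are_unique(ALL_GEAR_TYPES));
}
```

A check like this could live in a unit test so that a duplicated typeId is caught before it leads to fetching the same items twice.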

Generalizing the Fetch Logic

We can now replace fetch_amulets with a generic fetch_gear:

async fn fetch_gear(gear_type: &GearType, skip: usize) -> reqwest::Result<GetObjectsResponse> {
    let type_id = gear_type_to_type_id(gear_type);
    let url = format!(
        "https://api.dofusdb.fr/items?typeId[$in][]={}&$sort=-id&$skip={}",
        type_id, skip
    );

    let resp = reqwest::get(url).await?;
    let data: GetObjectsResponse = resp.json().await?;
    Ok(data)
}

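async fn fetch_all_gears(gear_type: &GearType) -> reqwest::Result<Vec<serde_json::Value>> {
    // Same loop as fetch_all_amulets, just calling fetch_gear
    let mut gears: Vec<serde_json::Value> = vec![];

    loop {
        let mut response = fetch_gear(gear_type, gears.len()).await?;
        if response.data.is_empty() {
            break;
        } else {
            gears.append(&mut response.data);
        }
    }

    Ok(gears)
}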

And in main.rs:

for gear_type in ALL_GEAR_TYPES {
    let result = fetch_all_gears(gear_type).await?;
    println!("Imported {} {:?} from DofusDB", result.len(), gear_type);
    save_dofus_db_data(&result, gear_type)?;
}

With a tiny tweak to save_dofus_db_data — making it create a subfolder for each gear type — we can now neatly organize our local data. Here’s what the directory looks like after running cargo run:

dofus_db/data $ for d in */; do echo "$(ls -1 "$d" | wc -l) $d"; done | sort -nr
     370 Ring/
     367 Hat/
     364 Boots/
     323 Amulet/
     300 Cloak/
     145 Shield/
     115 Belt/
     111 Sword/
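The per-type subfolder tweak mentioned above could look roughly like the following. This is a sketch under assumptions, not the actual save_dofus_db_data: the function name save_in_subfolder, the index-based file names, and the use of pre-serialized JSON strings are all invented so the example runs with the standard library alone. The folder names in the listing suggest the subfolder is derived from the gear type's Debug name, but that is a guess.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Writes each document to `<base_dir>/<subfolder>/<index>.json`,
/// creating the subfolder (and any missing parents) first.
/// Returns the path of the subfolder.
fn save_in_subfolder(
    base_dir: &Path,
    subfolder: &str,
    documents: &[String],
) -> io::Result<PathBuf> {
    let dir = base_dir.join(subfolder);
    // `create_dir_all` succeeds even if the directory already exists.
    fs::create_dir_all(&dir)?;
    for (index, document) in documents.iter().enumerate() {
        fs::write(dir.join(format!("{index}.json")), document)?;
    }
    Ok(dir)
}

fn main() -> io::Result<()> {
    // Hypothetical location and sample payloads, for illustration only.
    let base = std::env::temp_dir().join("dofus_db_sketch");
    let docs = vec![r#"{"id": 1}"#.to_string(), r#"{"id": 2}"#.to_string()];
    let dir = save_in_subfolder(&base, "Amulet", &docs)?;
    println!("saved {} documents under {}", docs.len(), dir.display());
    Ok(())
}
```

In the real project the subfolder name would come from the gear type and the file contents from the serde_json values returned by the API.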

I’m pretty happy with the result. Nothing groundbreaking, but Rust continues to feel clean and expressive for these kinds of tasks. Aside from that small mutation gotcha earlier, everything went smoothly.

Bonus: Fetching in Parallel 🧵

Currently, we loop through each GearType sequentially, but there’s no reason we can’t fetch multiple types in parallel. I used the futures crate for this so I could limit concurrency — after all, DofusDB is a fan-made site, and I don’t want to hammer it with too many simultaneous requests.

In Cargo.toml:

[dependencies]
futures = "0.3"

Then, in the Rust code:

use futures::{stream, StreamExt};

const MAX_CONCURRENCY: usize = 5;

stream::iter(ALL_GEAR_TYPES)
    .for_each_concurrent(MAX_CONCURRENCY, |gear_type| async move {
        if let Err(e) = fetch_and_save_gears(gear_type).await {
            eprintln!("❌ Failed to save {gear_type:?}: {e}");
        } else {
            println!("✅ Finished saving {gear_type:?}");
        }
    })
    .await;

As always, all code from this post is available here.