Laravel-Excel — P7: Chunk Reading

Handle massive datasets with efficient chunk reading

In our last article, we looked at Batch Inserting where we inserted a 100 rows at a time. Chunk reading deals with pulling a certain amount of data into memory. Without chunk reading, we pull the entire sheet into memory. That could be substantial if there is a lot of data inside the excel sheet.

https://blog.devgenius.io/laravel-excel-p6-batch-imports-261c80e46448

How to use Chunk Reading?

I hope that you’re seeing a trend appear: most things are pretty straightforward with Laravel-Excel. All we have to do is implement the WithChunkReading concern and use the chunkSize method.

?php

namespace App\Imports;

use App\Models\User;
use Illuminate\Support\Facades\Hash;
use Maatwebsite\Excel\Concerns\ToModel;
use Maatwebsite\Excel\Concerns\WithChunkReading;
use Maatwebsite\Excel\Concerns\WithHeadingRow;

class ChunkImport implements ToModel, WithHeadingRow, WithChunkReading
{
    /**
    * @param array $row
    *
    * @return \Illuminate\Database\Eloquent\Model|null
    */
    public function model(array $row)
    {
        return new User([
            'name'     => $row['first_name'] . " " . $row['last_name'],
            'email'    => $row['email'],
            'password' => Hash::make($row['password']),
        ]);
    }

    public function chunkSize(): int
    {
        return 100;
    }
}

Adding Batch Inserts to Chunk Reading

This is where Chunk Reading really begins to shine. You can read a limited amount of data from a file and then insert it as a chunk. For this example, we’ll read 20 rows and batch insert 20 rows at a time. Our file has 100 rows.

<?php

namespace App\Imports;

use App\Models\User;
use Illuminate\Support\Facades\Hash;
use Maatwebsite\Excel\Concerns\ToModel;
use Maatwebsite\Excel\Concerns\WithBatchInserts;
use Maatwebsite\Excel\Concerns\WithChunkReading;
use Maatwebsite\Excel\Concerns\WithHeadingRow;

class ChunkImport implements ToModel, WithHeadingRow, WithChunkReading, WithBatchInserts
{
    /**
    * @param array $row
    *
    * @return \Illuminate\Database\Eloquent\Model|null
    */
    public function model(array $row)
    {
        return new User([
            'name'     => $row['first_name'] . " " . $row['last_name'],
            'email'    => $row['email'],
            'password' => Hash::make($row['password']),
        ]);
    }

    public function chunkSize(): int
    {
        return 20;
    }

    public function batchSize(): int
    {
        return 20;
    }
}

Let’s create our method in the UserController class and add our route.

<?php

namespace App\Http\Controllers;

//...
use App\Imports\ChunkImport;
use App\Models\User;
use Illuminate\Http\Request;
use Maatwebsite\Excel\Facades\Excel;

class UserController extends Controller
{
    //...

    public function import_with_chunk_reading() {
        Excel::import(
            new ChunkImport,
            'mock_data/MOCK_DATA_6.csv'
        );

        return redirect('/')->with('success', 'Users Imported Successfully!');
    }
}
<?php

use Illuminate\Support\Facades\Route;
use App\Http\Controllers\UserController;
use Illuminate\Support\Facades\Artisan;

//...

Route::get(
    '/import-with-chunk-reading',
    [UserController::class, 'import_with_chunk_reading']
);

Where are You?

If you need to know which row you’re on, you can use the RemembersRowNumber trait and then just call $this->getRowNumber() to get the current row number. This works on sheets when they’re read in full or with chunks.

<?php

namespace App\Imports;

use App\Models\User;
use Illuminate\Support\Facades\Hash;
use Maatwebsite\Excel\Concerns\RemembersRowNumber;
use Maatwebsite\Excel\Concerns\ToModel;
use Maatwebsite\Excel\Concerns\WithBatchInserts;
use Maatwebsite\Excel\Concerns\WithChunkReading;
use Maatwebsite\Excel\Concerns\WithHeadingRow;

class ChunkImport implements ToModel, WithHeadingRow, WithChunkReading, WithBatchInserts
{

    use RemembersRowNumber;

    /**
    * @param array $row
    *
    * @return \Illuminate\Database\Eloquent\Model|null
    */
    public function model(array $row)
    {
        var_dump("Row: " . $this->getRowNumber());

        return new User([
            'name'     => $row['first_name'] . " " . $row['last_name'],
            'email'    => $row['email'],
            'password' => Hash::make($row['password']),
        ]);
    }

    public function chunkSize(): int
    {
        return 20;
    }

    public function batchSize(): int
    {
        return 20;
    }
}

Running this code, and removing the redirect to / that’s in all of the UserController methods, yields the following results:

string(6) "Row: 2" 
string(6) "Row: 3" 
string(6) "Row: 4" 
string(6) "Row: 5"
...
string(6) "Row: 101"

We can also see what the chunk offset is with the RemembersChunkOffset trait. Once used, just call the $this->getChunkOffset() method.

<?php

namespace App\Imports;

use App\Models\User;
use Illuminate\Support\Facades\Hash;
use Maatwebsite\Excel\Concerns\RemembersChunkOffset;
use Maatwebsite\Excel\Concerns\RemembersRowNumber;
use Maatwebsite\Excel\Concerns\ToModel;
use Maatwebsite\Excel\Concerns\WithBatchInserts;
use Maatwebsite\Excel\Concerns\WithChunkReading;
use Maatwebsite\Excel\Concerns\WithHeadingRow;

class ChunkImport implements ToModel, WithHeadingRow, WithChunkReading, WithBatchInserts
{

    use RemembersRowNumber, RemembersChunkOffset;

    /**
    * @param array $row
    *
    * @return \Illuminate\Database\Eloquent\Model|null
    */
    public function model(array $row)
    {
        var_dump("Row: " . $this->getRowNumber() . " -- Chunk: " . $this->getChunkOffset());

        return new User([
            'name'     => $row['first_name'] . " " . $row['last_name'],
            'email'    => $row['email'],
            'password' => Hash::make($row['password']),
        ]);
    }

    public function chunkSize(): int
    {
        return 20;
    }

    public function batchSize(): int
    {
        return 20;
    }
}

The results for each var_dump is listed below.

string(18) "Row: 2 -- Chunk: 2" 
string(18) "Row: 3 -- Chunk: 2" 
string(18) "Row: 4 -- Chunk: 2" 
string(18) "Row: 5 -- Chunk: 2" 
string(18) "Row: 6 -- Chunk: 2"
...
string(20) "Row: 99 -- Chunk: 82" 
string(21) "Row: 100 -- Chunk: 82" 
string(21) "Row: 101 -- Chunk: 82"

We’re one step away from making imports pretty awesome, which is queued imports. With Chunk Reading, Queued imports are going to be great. We’ll tackle those next.

 

Laravel Series

Continue your Laravel Learning.

Efficiently process large-scale data with batch imports.

Laravel-Excel — P6: Batch Imports

In the sixth installment of our Laravel-Excel series, discover how to perform large-scale data processing with batch imports. Explore methods for optimizing memory usage, handling parallel jobs, and ensuring robust data handling for massive spreadsheet files.

 

Laravel-Excel — P7: Chunk Reading

Handle massive datasets with efficient chunk reading.

Laravel-Excel — P7: Chunk Reading

In the seventh installment of our Laravel-Excel series, discover how to handle large datasets by reading them in manageable chunks. Learn techniques to optimize memory usage, boost performance, and maintain data integrity for scalable, high-performing applications.

Laravel-Excel — P8: Queued Import

Scale up data imports with queue-based execution.

Laravel-Excel — P8: Queued Import

In the eighth installment of our Laravel-Excel series, discover how to utilize queued imports to handle large and complex data loads without blocking your application. Learn how to combine job management, chunk reading, and parallel processing for fast, reliable data ingestion.

Leave a Reply