Handle massive datasets with efficient chunk reading
In our last article, we looked at batch inserts, where we inserted 100 rows at a time. Chunk reading deals with how much data is pulled into memory at once. Without chunk reading, we pull the entire sheet into memory, which can be substantial if there is a lot of data inside the Excel sheet.
https://blog.devgenius.io/laravel-excel-p6-batch-imports-261c80e46448
How to use Chunk Reading?
I hope that you’re seeing a trend appear: most things are pretty straightforward with Laravel-Excel. All we have to do is implement the WithChunkReading concern and use the chunkSize method.
<?php
namespace App\Imports;
use App\Models\User;
use Illuminate\Support\Facades\Hash;
use Maatwebsite\Excel\Concerns\ToModel;
use Maatwebsite\Excel\Concerns\WithChunkReading;
use Maatwebsite\Excel\Concerns\WithHeadingRow;
class ChunkImport implements ToModel, WithHeadingRow, WithChunkReading
{
/**
* @param array $row
*
* @return \Illuminate\Database\Eloquent\Model|null
*/
public function model(array $row)
{
return new User([
'name' => $row['first_name'] . " " . $row['last_name'],
'email' => $row['email'],
'password' => Hash::make($row['password']),
]);
}
public function chunkSize(): int
{
return 100;
}
}
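To convince yourself the concern is doing its job, you can check PHP’s peak memory usage after an import. Below is a minimal sketch, not part of the series code: FullImport is an assumed class identical to ChunkImport but without WithChunkReading. Run each variant in a separate request, because memory_get_peak_usage() reports the high-water mark for the entire process.
<?php
use App\Imports\ChunkImport;
use Maatwebsite\Excel\Facades\Excel;
// Run once with ChunkImport, then again with the assumed FullImport
// (no WithChunkReading), and compare the two numbers.
Excel::import(new ChunkImport, 'mock_data/MOCK_DATA_6.csv');
// Peak memory allocated to the process, in megabytes.
echo round(memory_get_peak_usage(true) / 1048576) . ' MB' . PHP_EOL;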
Adding Batch Inserts to Chunk Reading
This is where Chunk Reading really begins to shine. You can read a limited number of rows from a file and then insert them as a single batch. For this example, we’ll read 20 rows and batch insert 20 rows at a time. Our file has 100 rows.
<?php
namespace App\Imports;
use App\Models\User;
use Illuminate\Support\Facades\Hash;
use Maatwebsite\Excel\Concerns\ToModel;
use Maatwebsite\Excel\Concerns\WithBatchInserts;
use Maatwebsite\Excel\Concerns\WithChunkReading;
use Maatwebsite\Excel\Concerns\WithHeadingRow;
class ChunkImport implements ToModel, WithHeadingRow, WithChunkReading, WithBatchInserts
{
/**
* @param array $row
*
* @return \Illuminate\Database\Eloquent\Model|null
*/
public function model(array $row)
{
return new User([
'name' => $row['first_name'] . " " . $row['last_name'],
'email' => $row['email'],
'password' => Hash::make($row['password']),
]);
}
public function chunkSize(): int
{
return 20;
}
public function batchSize(): int
{
return 20;
}
}
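To see the interplay for yourself, you can log every query fired during the import. This is a hedged sketch, not something Laravel-Excel requires: with a 100-row file, a chunkSize of 20, and a batchSize of 20, you should see five multi-row INSERT statements instead of 100 single-row ones.
<?php
use App\Imports\ChunkImport;
use Illuminate\Support\Facades\DB;
use Maatwebsite\Excel\Facades\Excel;
// Log each SQL statement as it runs; every batch shows up as one
// INSERT INTO ... VALUES (...), (...), ... with 20 value tuples.
DB::listen(function ($query) {
    logger()->info($query->sql);
});
Excel::import(new ChunkImport, 'mock_data/MOCK_DATA_6.csv');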
Let’s create our method in the UserController class and add our route.
<?php
namespace App\Http\Controllers;
//...
use App\Imports\ChunkImport;
use App\Models\User;
use Illuminate\Http\Request;
use Maatwebsite\Excel\Facades\Excel;
class UserController extends Controller
{
//...
public function import_with_chunk_reading() {
Excel::import(
new ChunkImport,
'mock_data/MOCK_DATA_6.csv'
);
return redirect('/')->with('success', 'Users Imported Successfully!');
}
}
<?php
use Illuminate\Support\Facades\Route;
use App\Http\Controllers\UserController;
use Illuminate\Support\Facades\Artisan;
//...
Route::get(
'/import-with-chunk-reading',
[UserController::class, 'import_with_chunk_reading']
);
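One side note: Excel::import() resolves the path against your default filesystem disk. If the file lives on another disk, you can pass the disk name as the third argument:
Excel::import(new ChunkImport, 'mock_data/MOCK_DATA_6.csv', 's3');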
Where are You?
If you need to know which row you’re on, you can use the RemembersRowNumber trait and then call $this->getRowNumber() to get the current row number. This works whether the sheet is read in full or in chunks.
<?php
namespace App\Imports;
use App\Models\User;
use Illuminate\Support\Facades\Hash;
use Maatwebsite\Excel\Concerns\RemembersRowNumber;
use Maatwebsite\Excel\Concerns\ToModel;
use Maatwebsite\Excel\Concerns\WithBatchInserts;
use Maatwebsite\Excel\Concerns\WithChunkReading;
use Maatwebsite\Excel\Concerns\WithHeadingRow;
class ChunkImport implements ToModel, WithHeadingRow, WithChunkReading, WithBatchInserts
{
use RemembersRowNumber;
/**
* @param array $row
*
* @return \Illuminate\Database\Eloquent\Model|null
*/
public function model(array $row)
{
var_dump("Row: " . $this->getRowNumber());
return new User([
'name' => $row['first_name'] . " " . $row['last_name'],
'email' => $row['email'],
'password' => Hash::make($row['password']),
]);
}
public function chunkSize(): int
{
return 20;
}
public function batchSize(): int
{
return 20;
}
}
Running this code, and removing the redirect to / that’s in all of the UserController methods, yields the following results:
string(6) "Row: 2"
string(6) "Row: 3"
string(6) "Row: 4"
string(6) "Row: 5"
...
string(8) "Row: 101"
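Beyond logging, the row number is handy for error context. Here’s a hypothetical sketch, SafeChunkImport isn’t part of the series code: it skips rows with missing data and records which spreadsheet row was skipped. Returning null from model() tells Laravel-Excel to skip that row.
<?php
namespace App\Imports;
use App\Models\User;
use Illuminate\Support\Facades\Hash;
use Maatwebsite\Excel\Concerns\RemembersRowNumber;
use Maatwebsite\Excel\Concerns\ToModel;
use Maatwebsite\Excel\Concerns\WithChunkReading;
use Maatwebsite\Excel\Concerns\WithHeadingRow;
class SafeChunkImport implements ToModel, WithHeadingRow, WithChunkReading
{
    use RemembersRowNumber;
    public function model(array $row)
    {
        // Returning null skips the row entirely.
        if (empty($row['email'])) {
            logger()->warning("Skipping row {$this->getRowNumber()}: missing email");
            return null;
        }
        return new User([
            'name' => $row['first_name'] . " " . $row['last_name'],
            'email' => $row['email'],
            'password' => Hash::make($row['password']),
        ]);
    }
    public function chunkSize(): int
    {
        return 20;
    }
}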
We can also see what the chunk offset is with the RemembersChunkOffset trait. Once it’s used, just call the $this->getChunkOffset() method. The chunk offset is the row number at which the current chunk begins, so with a heading row and a chunk size of 20, the offsets for our 100-row file are 2, 22, 42, 62, and 82.
<?php
namespace App\Imports;
use App\Models\User;
use Illuminate\Support\Facades\Hash;
use Maatwebsite\Excel\Concerns\RemembersChunkOffset;
use Maatwebsite\Excel\Concerns\RemembersRowNumber;
use Maatwebsite\Excel\Concerns\ToModel;
use Maatwebsite\Excel\Concerns\WithBatchInserts;
use Maatwebsite\Excel\Concerns\WithChunkReading;
use Maatwebsite\Excel\Concerns\WithHeadingRow;
class ChunkImport implements ToModel, WithHeadingRow, WithChunkReading, WithBatchInserts
{
use RemembersRowNumber, RemembersChunkOffset;
/**
* @param array $row
*
* @return \Illuminate\Database\Eloquent\Model|null
*/
public function model(array $row)
{
var_dump("Row: " . $this->getRowNumber() . " -- Chunk: " . $this->getChunkOffset());
return new User([
'name' => $row['first_name'] . " " . $row['last_name'],
'email' => $row['email'],
'password' => Hash::make($row['password']),
]);
}
public function chunkSize(): int
{
return 20;
}
public function batchSize(): int
{
return 20;
}
}
The output of each var_dump call is listed below.
string(18) "Row: 2 -- Chunk: 2"
string(18) "Row: 3 -- Chunk: 2"
string(18) "Row: 4 -- Chunk: 2"
string(18) "Row: 5 -- Chunk: 2"
string(18) "Row: 6 -- Chunk: 2"
...
string(20) "Row: 99 -- Chunk: 82"
string(21) "Row: 100 -- Chunk: 82"
string(21) "Row: 101 -- Chunk: 82"
We’re one step away from making imports pretty awesome: queued imports. Combined with chunk reading, queued imports are going to be great. We’ll tackle those next.