Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory issue #29

Open
dansleboby opened this issue Dec 11, 2024 · 9 comments
Open

Memory issue #29

dansleboby opened this issue Dec 11, 2024 · 9 comments

Comments

@dansleboby
Copy link

Hi,

Here is my test script:

<?php

require 'vendor/autoload.php';

use lsolesen\pel\PelJpeg;

function bytes($size) {
	$unit = ['b', 'kb', 'mb', 'gb', 'tb', 'pb'];
	return @round($size / pow(1024, ($i = floor(log($size, 1024)))), 2) . ' ' . $unit[$i];
}

$files = glob("dsc*.{JPG,jpeg}", GLOB_BRACE);

foreach($files as $imagePath) {
	echo "Processing file: " . $imagePath . PHP_EOL;
	try {
		$jpeg = new PelJpeg($imagePath);
	} catch(Exception $e) {
		echo 'Error: ' . $e->getMessage() . PHP_EOL;
	}
	printf("MEM:  %s\nPEAK: %s\n", bytes(memory_get_usage()), bytes(memory_get_peak_usage()));
}
The profile will be stored in your Personal environment. The "--environment" option can be used to specify the target environment.
Processing file: dsc00004_1700085011.JPG
MEM:  18.02 mb
PEAK: 18.05 mb
Processing file: dsc00005_1700085017.JPG
MEM:  34.11 mb
PEAK: 34.12 mb
Processing file: dsc00006_1700085022.JPG
MEM:  50.2 mb
PEAK: 50.21 mb
Processing file: dsc00010_1731530648.JPG
MEM:  66.29 mb
PEAK: 66.3 mb
Processing file: dsc00011_1731530660.JPG
MEM:  82.38 mb
PEAK: 82.39 mb
Processing file: dsc00012_1731530667.JPG
MEM:  98.47 mb
PEAK: 98.48 mb

Blackfire run completed
Graph                 https://blackfire.io/profiles/f3f67423-90f3-4687-85b2-f17b041b4b83/graph
Timeline              https://blackfire.io/profiles/f3f67423-90f3-4687-85b2-f17b041b4b83/graph?settings%5Bdimension%5D=timeline
No tests!             Create some now https://docs.blackfire.io/testing-cookbooks/tests
No recommendations

CPU Time     22.6ms
Memory       97.5MB
Wall Time    54.1ms
I/O Wait     31.5ms
Network         n/a     n/a     n/a
SQL             n/a     n/a

I also try to add

$jpeg = null;
gc_collect_cycles();

But it change nothing

I think the issue is coming from the clone class, but I'm not shure:

$c = clone $this;

Test env ( also test : (8.4.1, 8.2.26) )

PHP 8.2.14 (cli) (built: Dec 20 2023 10:19:26) (NTS Visual C++ 2019 x64)
Copyright (c) The PHP Group
Zend Engine v4.2.14, Copyright (c) Zend Technologies
    with blackfire v1.92.30~win-x64-non_zts82, https://blackfire.io, by Blackfire

Microsoft Windows 11 Pro 10.0.22631 build 22631

Thanks for time and help

@dansleboby
Copy link
Author

dansleboby commented Dec 11, 2024

I try on linux too same result

Processing file: dsc00004_1700085011.JPG
MEM:  15.46 mb
PEAK: 15.49 mb
Processing file: dsc00005_1700085017.JPG
MEM:  29.85 mb
PEAK: 29.86 mb
Processing file: dsc00006_1700085022.JPG
MEM:  44.21 mb
PEAK: 44.23 mb
Processing file: dsc00010_1731530648.JPG
MEM:  59.18 mb
PEAK: 59.19 mb
Processing file: dsc00011_1731530660.JPG
MEM:  73.85 mb
PEAK: 73.87 mb
Processing file: dsc00012_1731530667.JPG
MEM:  88.18 mb
PEAK: 88.2 mb

env

Debian 12

PHP 8.2.26 (cli) (built: Nov 25 2024 17:21:51) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.2.26, Copyright (c) Zend Technologies
    with Zend OPcache v8.2.26, Copyright (c), by Zend Technologies

@indy2kro
Copy link

indy2kro commented Jan 5, 2025

The bug actually comes from file_get_contents() usage which leads to memory leaks when files are bigger (from my tests this behavior seems to occur only for files larger than about 2MB so probably related to some kind of buffer size).
The only real solution to this would be to not load the files entirely in memory and use a file stream instead.

@indy2kro
Copy link

indy2kro commented Jan 5, 2025

This would be my approach to fix it: indy2kro@9115d66

@dansleboby
Copy link
Author

This help for big file, but the memory issue is still present

Without gc_collect_cycles(); (output_tests_without_gc.csv
)
image

With gc_collect_cycles(); (output_tests.csv)
image

This is my new tests file using tests file from the repo

<?php

require 'vendor/autoload.php';


use lsolesen\pel\PelJpeg;

function bytes($size) {
	$unit = [
		'b',
		'kb',
		'mb',
		'gb',
		'tb',
		'pb'
	];
	return @round($size / pow(1024, ($i = floor(log($size, 1024)))), 2) . ' ' . $unit[$i];
}

$files = glob("vendor/fileeye/pel/test/imagetests/*.jpg");

$csv = fopen('output_tests.csv', 'w');

for($i = 0; $i <= 100; $i++) {
	echo "Pass: " . $i . PHP_EOL;

	foreach($files as $imagePath) {
		//echo "Processing file: " . $imagePath . PHP_EOL;
		try {
			// Load the JPEG file
			$jpeg = new PelJpeg($imagePath);
			gc_collect_cycles();
		} catch(Exception $e) {
			echo 'Error: ' . $e->getMessage() . PHP_EOL;
		}

		fputcsv($csv, [
			$i,
			microtime(true),
			basename($imagePath),
			memory_get_usage(),
			memory_get_peak_usage()
		]);
	}
}

printf("MEM:  %s\nPEAK: %s\n", bytes(memory_get_usage()), bytes(memory_get_peak_usage()));

// Save the CSV data
if(!empty($csvData)) {
	fputcsv($csv, $csvData);
}

fclose($csv);

I also do a test with a file_get_contents:

<?php

require 'vendor/autoload.php';

function bytes($size) {
	$unit = [
		'b',
		'kb',
		'mb',
		'gb',
		'tb',
		'pb'
	];
	return @round($size / pow(1024, ($i = floor(log($size, 1024)))), 2) . ' ' . $unit[$i];
}

$files = glob("vendor/fileeye/pel/test/imagetests/*.jpg");

$csv = fopen('output_tests.csv', 'w');

for($i = 0; $i <= 100; $i++) {
	echo "Pass: " . $i . PHP_EOL;

	foreach($files as $imagePath) {
		//echo "Processing file: " . $imagePath . PHP_EOL;
		try {
			// Load the JPEG file
			//$jpeg = new PelJpeg($imagePath);
			$jpeg = file_get_contents($imagePath);
			gc_collect_cycles();
		} catch(Exception $e) {
			echo 'Error: ' . $e->getMessage() . PHP_EOL;
		}

		fputcsv($csv, [
			$i,
			microtime(true),
			basename($imagePath),
			memory_get_usage(),
			memory_get_peak_usage()
		]);
	}
}

printf("MEM:  %s\nPEAK: %s\n", bytes(memory_get_usage()), bytes(memory_get_peak_usage()));

// Save the CSV data
if(!empty($csvData)) {
	fputcsv($csv, $csvData);
}

fclose($csv);

image

PS: The code for the graph gen ( https://gist.github.com/dansleboby/68b93c34fad8d6e23bd65068f05be84d )

@indy2kro
Copy link

indy2kro commented Jan 5, 2025

Well, that is kind of to be expected since basically both of the snippets of code end up doing the same thing - file_get_contents and indeed calling manually garbage collector doesn't seem to do any anything in reality.
If you are up to testing you can give it a try with the code from the fork and see if it works at least.
You should be able to test temporarily by adding to your composer.json:

  "repositories": [
    {
      "type": "vcs",
      "url": "https://github.com/indy2kro/pel",
      "no-api": true,
      "name": "pel"
    }
  ],

The stream functionality should work by default with the same code - from my tests the difference is quite easy to notice.

@dansleboby
Copy link
Author

{
  "require": {
    "fileeye/pel": "dev-master"
  },
  "repositories": [
    {
      "type": "vcs",
      "url": "https://github.com/indy2kro/pel"
    }
  ]
}

I'm already on your fork :)

If I use the "fileeye/pel": "0.11.0" with same code as the last test with manual GC I get

image

MEM: 38.46 mb
PEAK: 38.75 mb

So the stream made a huge global improvement, but the issue still present

@indy2kro
Copy link

indy2kro commented Jan 5, 2025

That behavior comes from the way exceptions are handled inside Pel class, you can just call Pel::clearExceptions(); between batches (or after every operation).

E.g. this works fine for me:

for($i = 0; $i <= 100; $i++) {
	echo "Pass: " . $i . PHP_EOL;

	foreach($files as $imagePath) {
		//echo "Processing file: " . $imagePath . PHP_EOL;
		try {
			// Load the JPEG file
			$jpeg = new PelJpeg($imagePath);
		} catch(Exception $e) {
			echo 'Error: ' . $e->getMessage() . PHP_EOL;
		}

		@fputcsv($csv, [
			$i,
			microtime(true),
			basename($imagePath),
			memory_get_usage(),
			memory_get_peak_usage()
		]);
	}

        gc_collect_cycles();
        Pel::clearExceptions();
}

@dansleboby
Copy link
Author

OMG

Pel::clearExceptions();

was the missing thing, since it call statictly it keep it values for all run. With this the memory is now ok ( thanks a lot 😍 )
image

I think it should be clear in the PelJpeg and PelTiff in the load or __construct functions because I don't see a use case to keep all exception of all call of PelJpeg / PelTiff

@mondrake
Copy link

@dansleboby @indy2kro 0.12 is released. Can we close this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants