Grawler

Install

Via Composer

$ composer require sleimanx2/grawler

Basic Usage

getting the page dom
require_once('vendor/autoload.php');

$client = new Bowtie\Grawler\Client();

$grawler = $client->download('http://example.com');
finding basic attributes
$grawler->title();
// provide a css path to find the attribute
$grawler->body($path = '.main-content');
// extracts meta keywords (array)
$grawler->keywords();
// extracts meta description 
$grawler->description();
finding media
$grawler->images('.content img');
$grawler->videos('iframe');
$grawler->audio('.audio iframe');

Resolving media attributes

In order to resolve media attributes, you need to load the providers' configuration (see Grawler config).

videos

Current video resolvers: YouTube, Vimeo

// resolve all videos at once 
$videos = $grawler->videos('iframe')->resolve();

Then you can access the video attributes as follows:

foreach($videos as $video)
{
  $video->id; // the video provider id
  $video->title;
  $video->description;
  $video->url;
  $video->embedUrl;
  $video->images; // Collection of Image instances
  $video->author;
  $video->authorId;
  $video->duration;
  $video->provider; //video source
}

You can also resolve videos individually as follows:

$videos = $grawler->videos('iframe');

foreach($videos as $video)
{
  $video->resolve();
  $video->title;
  //...
}

audio

Current audio resolvers: SoundCloud

// resolve all audio at once 
$audio = $grawler->audio('.audio iframe')->resolve();

Then you can access the audio attributes as follows:

foreach($audio as $track)
{
  $track->id; // the audio provider id
  $track->title;
  $track->description;
  $track->url;
  $track->embedUrl;
  $track->images; // Collection of cover photo instances
  $track->author;
  $track->authorId;
  $track->duration;
  $track->provider; // audio source
}

You can also resolve audio individually as follows:

$audio = $grawler->audio('.audio iframe');

foreach($audio as $track)
{
  $track->resolve();
  $track->title;
  //...
}

Resolving page URLs

$links = $grawler->links('.main thumb a');

foreach($links as $link)
{
  print $link;
  // or
  print $link->uri;
  // or
  print $link->getUri();
}
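
Once the links are available as strings, a common next step is to keep only the unique URLs that point to a given host. The sketch below is one way to do that with plain PHP; `filterLinksByHost` is a hypothetical helper and not part of Grawler, and the sample URLs are made up for illustration.

```php
<?php
// Hypothetical helper, not part of Grawler: returns the unique URLs whose
// host matches $host (case-insensitive). Relative and malformed URLs are skipped.
function filterLinksByHost(array $urls, string $host): array
{
    $kept = [];
    foreach ($urls as $url) {
        // parse_url() yields null when the URL has no host, false when it is malformed
        $linkHost = parse_url($url, PHP_URL_HOST);
        if (is_string($linkHost) && strcasecmp($linkHost, $host) === 0) {
            $kept[$url] = true; // array keys deduplicate repeated URLs
        }
    }
    return array_keys($kept);
}

// Illustrative input only
$urls = [
    'http://example.com/a',
    'http://example.com/a',
    'http://other.test/b',
    '/relative/path',
];
print_r(filterLinksByHost($urls, 'example.com'));
```

With Grawler, the input array could be built by collecting `(string) $link` inside the `foreach` loop, assuming the string cast behaves like the `print $link` call shown in this section.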

Configuration

Client Config

Set user agent
$client->agent('Googlebot/2.1')->download('http://example.com');

Recommended reading: http://webmasters.stackexchange.com/questions/6205/what-user-agent-should-i-set

Set request auth
$client->auth('me', '**');

You can change the auth type as follows:

$client->auth('me', '**', $type = 'basic');
Set request method
$client->method('post');

Grawler config

By default, Grawler tries to read the following environment variables:

GRAWLER_YOUTUBE_KEY

GRAWLER_VIMEO_KEY
GRAWLER_VIMEO_SECRET

GRAWLER_SOUNDCLOUD_KEY
GRAWLER_SOUNDCLOUD_SECRET

If you don't use environment variables, you can load the configuration as follows:

$config = [
  'youtubeKey' => '',

  'vimeoKey'    => '',
  'vimeoSecret' => '',

  'soundcloudKey'    => '',
  'soundcloudSecret' => '',
];

$grawler->loadConfig($config);
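
Because an empty or misspelled key only shows up once a provider is called, it can help to check the array before handing it to `loadConfig`. The following is a minimal sketch in plain PHP; `missingProviderKeys` is a hypothetical helper and not part of Grawler, and which keys each provider actually requires is an assumption left to the caller.

```php
<?php
// Hypothetical helper, not part of Grawler: returns the names of the
// required config keys that are absent or blank in $config.
function missingProviderKeys(array $config, array $required): array
{
    $missing = [];
    foreach ($required as $key) {
        if (!isset($config[$key]) || trim((string) $config[$key]) === '') {
            $missing[] = $key;
        }
    }
    return $missing;
}

// Build the array from the environment variable names listed above;
// getenv() returns false for unset variables, which becomes ''.
$config = [
    'youtubeKey'  => getenv('GRAWLER_YOUTUBE_KEY') ?: '',
    'vimeoKey'    => getenv('GRAWLER_VIMEO_KEY') ?: '',
    'vimeoSecret' => getenv('GRAWLER_VIMEO_SECRET') ?: '',
];

// Assumption: this caller only resolves videos, so only these keys are checked.
$missing = missingProviderKeys($config, ['youtubeKey', 'vimeoKey', 'vimeoSecret']);
if ($missing !== []) {
    echo 'Missing provider keys: ' . implode(', ', $missing) . PHP_EOL;
}
```

The same check could be run before the integration test suite, since those tests need real provider keys.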

Testing

$ phpunit --testsuite unit
$ phpunit --testsuite integration

NB: you should set your provider keys (YouTube, Vimeo, SoundCloud...) to run the integration tests.

Contributing

Please see CONTRIBUTING

Security

If you discover any security related issues, please email [email protected] instead of using the issue tracker.

License

The MIT License (MIT). Please see License File for more information.

About

please don't use it!