path: root/posts/2009-04-27-a-simple-feed-aggregator-with-modern-perl-part-1.org



Following
[[http://www.shadowcat.co.uk/blog/matt-s-trout/iron-man/][Matt's post]]
about people not blogging enough about Perl, I've decided to try to post
once a week about Perl. I will start with a series of articles about
what we call *modern Perl*. For this, I will write a simple feed
aggregator using [[https://metacpan.org/pod/Moose][Moose]],
[[http://search.cpan.org/perldoc?DBIx::Class][DBIx::Class]],
[[http://search.cpan.org/perldoc?KiokuDB][KiokuDB]], some tests, and a
basic frontend (with
[[http://search.cpan.org/perldoc?Catalyst][Catalyst]]). This series
will be split into four parts:

-  the first one will explain how to create a schema using *DBIx::Class*
-  the second will be about the aggregator. I will use *Moose* and
   *KiokuDB*
-  the third one will be about writing tests with *Test::Class*
-  the last one will focus on *Catalyst*

The code of these modules will be available on
[[http://git.lumberjaph.net/][my git server]] at the same time each
article is published.

#+BEGIN_QUOTE
  I'm not showing you how to write the perfect feed aggregator. The
  purpose of this series of articles is only to show you how to write a
  simple aggregator using modern Perl.
#+END_QUOTE

*** The database schema

We will use a database to store a list of feeds and feed entries. As I
don't like, no, wait, I /hate/ SQL, I will use an ORM for accessing the
database. For this, my choice is *DBIx::Class*, the best ORM available
in Perl.

#+BEGIN_QUOTE
  If you have never used an ORM before: ORM stands for Object Relational
  Mapping. It's a SQL-to-OO mapper that provides an abstract
  encapsulation of your database operations. The purpose of *DBIx::Class*
  is to represent queries in your code as perl-ish as possible.
#+END_QUOTE

For a basic aggregator we need:

-  a table for the list of feeds
-  a table for the entries

We will create these two tables using *DBIx::Class*. For this, we first
create a schema module. I use *Module::Setup*, but you can use
*Module::Starter* or whatever you want.

#+BEGIN_EXAMPLE
    % module-setup MyModel
    % cd MyModel
    % vim lib/MyModel.pm
#+END_EXAMPLE

#+BEGIN_SRC perl
    package MyModel;
    use base qw/DBIx::Class::Schema/;
    __PACKAGE__->load_classes();
    1;
#+END_SRC

So, we have just created a schema class. The *load\_classes* method
loads all the classes that reside under the *MyModel* namespace. We now
create the result class *MyModel::Feed* in *lib/MyModel/Feed.pm*:

#+BEGIN_SRC perl
    package MyModel::Feed;
    use base qw/DBIx::Class/;
    __PACKAGE__->load_components(qw/Core/);
    __PACKAGE__->table('feed');
    __PACKAGE__->add_columns(qw/ feedid url /);
    __PACKAGE__->set_primary_key('feedid');
    __PACKAGE__->has_many(entries => 'MyModel::Entry', 'feedid');
    1;
#+END_SRC

Pretty self-explanatory: we declare a result class that uses the table
*feed*, with two columns, *feedid* and *url*, *feedid* being the primary
key. The *has\_many* method declares a one-to-many relationship: a feed
has many entries, matched on the *feedid* column of *MyModel::Entry*.

Now the result class *MyModel::Entry* in *lib/MyModel/Entry.pm*:

#+BEGIN_SRC perl
    package MyModel::Entry;
    use base qw/DBIx::Class/;
    __PACKAGE__->load_components(qw/Core/);
    __PACKAGE__->table('entry');
    __PACKAGE__->add_columns(qw/ entryid permalink feedid/);
    __PACKAGE__->set_primary_key('entryid');
    __PACKAGE__->belongs_to(feed => 'MyModel::Feed', 'feedid');
    1;
#+END_SRC

Here *belongs\_to* declares the *feed* relationship, using the column
*feedid* as the foreign key.
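
To make the two relationships concrete, here is a minimal sketch of how the generated accessors could be used. It assumes the *MyModel* classes above are in *lib/* and that *DBD::SQLite* and *SQL::Translator* are installed. The in-memory database, the URLs and the ids are made up for the example. The ids are given explicitly because the columns are declared without a type or an auto-increment.

#+BEGIN_SRC perl
    use strict;
    use warnings;
    use feature 'say';
    use lib 'lib';
    use MyModel;

    # in-memory SQLite database, deployed on the fly (assumption for this sketch)
    my $schema = MyModel->connect('dbi:SQLite:dbname=:memory:');
    $schema->deploy;

    # create a feed; the id is explicit since feedid is not auto-incremented
    my $feed = $schema->resultset('Feed')->create(
        { feedid => 1, url => 'http://example.com/atom.xml' }
    );

    # has_many: create_related fills in the feedid of each new entry
    $feed->create_related('entries',
        { entryid => $_, permalink => "http://example.com/post/$_" })
        for 1 .. 2;

    # has_many gives us a resultset with the entries of this feed
    say 'entries: ', $feed->entries->count;

    # belongs_to gives us the feed an entry belongs to
    my $entry = $schema->resultset('Entry')->find(1);
    say 'feed url: ', $entry->feed->url;
#+END_SRC

The source names *Feed* and *Entry* passed to =resultset= come from the class names loaded by *load\_classes*, with the *MyModel* prefix removed.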

Column declarations can be more detailed. If you want to declare the
type of your fields, you can do this:

#+BEGIN_SRC perl
    __PACKAGE__->add_columns(
        'permalink' => {
            'data_type'         => 'TEXT',
            'is_auto_increment' => 0,
            'default_value'     => undef,
            'is_foreign_key'    => 0,
            'name'              => 'permalink',
            'is_nullable'       => 1,
            'size'              => '65535'
        },
    );
#+END_SRC
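
As an illustration of the same idea applied to the other table, here is a sketch of what a typed version of *MyModel::Feed* might look like. The types and the size of 255 are assumptions chosen for the example and are not part of the original schema. Declaring *feedid* as an auto-incremented integer lets the database assign the ids.

#+BEGIN_SRC perl
    package MyModel::Feed;
    use strict;
    use warnings;
    use base qw/DBIx::Class/;

    __PACKAGE__->load_components(qw/Core/);
    __PACKAGE__->table('feed');
    __PACKAGE__->add_columns(
        feedid => {
            data_type         => 'integer',
            is_auto_increment => 1,
            is_nullable       => 0,
        },
        url => {
            data_type   => 'varchar',
            size        => 255,    # arbitrary limit chosen for this sketch
            is_nullable => 0,
        },
    );
    __PACKAGE__->set_primary_key('feedid');
    __PACKAGE__->has_many(entries => 'MyModel::Entry', 'feedid');
    1;
#+END_SRC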

*DBIx::Class* also provides hooks for the deploy command. If you are
using MySQL, you may need an InnoDB table. In your result class, you can
add this:

#+BEGIN_SRC perl
    sub sqlt_deploy_hook {
        my ($self, $sqlt_table) = @_;
        $sqlt_table->extra(
            mysql_table_type => 'InnoDB',
            mysql_charset    => 'utf8'
        );
    }
#+END_SRC

The next time you call deploy, the hook is called with the
*SQL::Translator::Schema::Table* object for this table, and sets the
table type to InnoDB and the charset to utf8.
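
One way to check what such a hook produces, without a MySQL server at hand, is to ask the schema for its deployment statements. This is only a sketch: it assumes the *MyModel* classes are in *lib/* and that *SQL::Translator* is installed, and the SQLite DSN is there just to get a connected schema object.

#+BEGIN_SRC perl
    use strict;
    use warnings;
    use feature 'say';
    use lib 'lib';
    use MyModel;

    # any DSN will do here: we only generate SQL, nothing is deployed
    my $schema = MyModel->connect('dbi:SQLite:dbname=:memory:');

    # ask for the MySQL flavour of the DDL, whatever database we are connected to
    my $ddl = $schema->deployment_statements('MySQL');
    say $ddl;
#+END_SRC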

Now that we have a *DBIx::Class* schema, we need to deploy it. For this,
I always do the same thing: create a *bin/deploy\_mymodel.pl* script
with the following code:

#+BEGIN_SRC perl
    use strict;
    use warnings;
    use feature 'say';
    use Getopt::Long;
    use lib('lib');
    use MyModel;

    GetOptions(
        'dsn=s'    => \my $dsn,
        'user=s'   => \my $user,
        'passwd=s' => \my $passwd
    ) or die usage();

    # a DSN is mandatory; user and password are optional (e.g. with SQLite)
    die usage() unless defined $dsn;

    my $schema = MyModel->connect($dsn, $user, $passwd);
    say 'deploying schema ...';
    $schema->deploy;

    say 'done';

    # return the usage message, so that die() can print it
    sub usage {
        return
            'usage: deploy_mymodel.pl --dsn $dsn --user $user --passwd $passwd'
            . "\n";
    }
#+END_SRC

This script deploys the schema for you (with MySQL, you need to create
the database first).

Executing the command
=perl bin/deploy_mymodel.pl --dsn dbi:SQLite:model.db= will generate a
*model.db* database that we can work with and test against. Now that we
have our (really) simple *MyModel* schema, we can start to hack on our
aggregator.

[[http://git.lumberjaph.net/p5-ironman-mymodel.git/][The code is
available on my git server]].

#+BEGIN_QUOTE
  While using *DBIx::Class*, you may want to take a look at the
  generated queries. For this, export =DBIC_TRACE=1= in your
  environment, and the queries will be printed on STDERR.
#+END_QUOTE
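
The trace can also be switched on from the code, for a single schema object, with the =debug= method of the storage. The sketch below assumes the *MyModel* classes are in *lib/* and uses an in-memory SQLite database so that it does not depend on an existing *model.db*.

#+BEGIN_SRC perl
    use strict;
    use warnings;
    use lib 'lib';
    use MyModel;

    # in-memory database, deployed on the fly (assumption for this sketch)
    my $schema = MyModel->connect('dbi:SQLite:dbname=:memory:');
    $schema->deploy;

    # same effect as DBIC_TRACE=1, but only for this schema object
    $schema->storage->debug(1);

    # every query issued from now on is printed on STDERR
    my $count = $schema->resultset('Feed')->count;
#+END_SRC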