New DFL site coming

Feb 2 is the tentative due date for the DFL rewrite. Actually it’s the first phase of the upgrade, which will bring it up to speed with the current site in many ways. Additional functionality will be added every few weeks after that.

One of the biggest new features in this update will be the addition of the MRI dataset viewer/rendering application. Let’s just say this is groundbreaking work – the first time anything like this has been done successfully on the web. My coworker, German (yes, that’s his real name), has been working very hard on this application.

DFL update

My boss wants a nice-looking front end for the new DFL site, even if all the links are broken, so expect something next week. As it currently stands, the new design is going to borrow heavily from the current one, though it should use space a little better and display information a bit more logically. You can see the current site at http://www.digitalfishlibrary.org/. The redesign takes advantage of the Smarty Template Engine, so expect to see some Smarty tutorials in the near future.

File access control

… and bandwidth throttling, using PHP.

If you’re in a situation where you need to control who has access to certain files/downloads, PHP can do it all for you. Even better, you can use the same script to throttle how quickly data is sent to the user. Let’s take a look at this problem:

First, you obviously need some kind of files that need protection. Make a downloads directory and stick a .htaccess file in there with the following contents:
Options -Indexes
Order Deny,Allow
Deny from all
# On Apache 2.4 and later, use this instead: Require all denied

Of course you’ll also want to put a file worth downloading in there: topsecretfile.pdf

Create a file in your web root called download.php. In it you’re going to put some fairly straightforward code to send a file to only authorized users. There are a few things you should take care of yourself, such as proper escaping of user-supplied data and user validation (they’re beyond the scope of this post).

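//the requested file name arrives in the query string,
//e.g. download.php?file=topsecretfile.pdf
$file = isset($_GET['file']) ? $_GET['file'] : '';

//clean input so you don't have any nasties.
//You'll have to provide a similar function yourself
$fname=cleaninput($file);
$path='downloads/'.$fname;

//verify the user; where $userid comes from (a session, for example)
//is up to your own login code
$valid_user=is_valid_user($userid);

if($valid_user && is_file($path))
{
   //sends the appropriate file header, sends file contents
   header('Content-Type: application/pdf');
   readfile($path);
}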

And there you have it! The files are safe from snooping via requests made directly to Apache, and you can even make sure only the right people can download topsecretfile.pdf.
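The cleaninput() function is left to the reader above, so here is one minimal sketch of what it could look like. It is an assumption, not the original helper: it supposes every download is a plain file name inside a single directory, and it returns an empty string for anything suspicious.

```php
<?php
// Hypothetical implementation of the cleaninput() helper.
// Assumption: downloads are plain file names in one directory.
function cleaninput(string $file): string
{
    // basename() drops directory components,
    // e.g. "../../etc/passwd" becomes "passwd"
    $name = basename($file);

    // Allow only a conservative set of characters. \A and \z anchor
    // the whole string, so a trailing newline is rejected too.
    if (!preg_match('/\A[A-Za-z0-9][A-Za-z0-9._-]*\z/', $name)) {
        return '';
    }

    return $name;
}
```

When the result is an empty string, the path becomes just 'downloads/', which is_file() rejects, so nothing is sent.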

If you want to throttle downloads for some reason, replace the readfile($path); line with the following code:

    $handle=fopen($path,'rb');

    //number of kilobytes per second you want to
    //throttle the file download to.
    $rate = 20;
    while(!feof($handle) && (connection_status()==0))
      {
        //tells the script to sleep for one second
        sleep(1);

        //reads 1024*$rate bytes at a time
        print(fread($handle, 1024*$rate));

        //send the output buffer
        flush();
      }

    //release the file handle once the transfer ends
    fclose($handle);

So, in this example, we don’t want any user to download faster than 20 kilobytes per second. sleep(1) makes sure the while() loop doesn’t execute more than once per second, and each pass sends at most 1024*$rate bytes. sleep() is useful in plenty of other situations, too.
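To get a feel for what a given rate means in practice, the arithmetic can be sketched as a small function. The name estimated_download_seconds() is made up for this example, and the result is only an estimate: it counts one chunk per second and ignores network latency.

```php
<?php
// Hypothetical helper: roughly how many seconds the throttled loop
// needs to send $bytes at $rateKb kilobytes per second.
function estimated_download_seconds(int $bytes, int $rateKb): int
{
    // The loop sends one chunk of 1024*$rateKb bytes per sleep(1)
    $chunk = 1024 * $rateKb;

    return (int) ceil($bytes / $chunk);
}

// A 5 MB file at 20 KB/s: 5242880 / 20480 = 256 chunks
echo estimated_download_seconds(5 * 1024 * 1024, 20), "\n"; // prints 256
```

So a 5 MB PDF ties up one PHP process for a little over four minutes at this rate, which is worth keeping in mind when picking $rate.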

One thing to point out: if you’re checking user authorization against a database, you probably want to close the db connection before you start sending the file over. This is because PHP will keep a connection to the DB open as long as the script is running, or until you explicitly close the connection before the end of the script. This is especially important as site traffic increases (there’s a chance of hitting the max allowed connections to the database).
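One way to arrange this is to keep the database work inside a function that opens the connection, runs the check, and lets go of the connection before returning. The sketch below uses PDO. The DSN, the users table, and the can_download column are all assumptions for illustration, not part of the DFL code.

```php
<?php
// Hypothetical authorization check. The table and column names are
// assumptions made for this sketch.
function user_may_download(string $dsn, int $userid): bool
{
    $db = new PDO($dsn);
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    $stmt = $db->prepare(
        'SELECT 1 FROM users WHERE id = ? AND can_download = 1'
    );
    $stmt->execute([$userid]);
    $allowed = ($stmt->fetchColumn() !== false);

    // PDO closes the connection once nothing references it any more,
    // so drop both the statement and the connection before returning.
    $stmt = null;
    $db = null;

    return $allowed;
}
```

Call it before the send loop starts. By the time the first byte goes out, the connection is already released, no matter how long the throttled download takes. If you use the mysqli functions instead, mysqli_close() does the same job.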

Finally, you can always use mod_rewrite to make nicer URLs: http://www.domain.ext/files/topsecretfile.pdf instead of http://www.domain.ext/download.php?file=topsecretfile.pdf. I’ll leave that to another quick tutorial.
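In the meantime, here is a rough idea of what such a rule might look like. Treat it as a sketch: it assumes mod_rewrite is enabled, that .htaccess overrides are allowed, and that this goes in the .htaccess file in your web root next to download.php.

```apache
# Assumes mod_rewrite is available and this file sits in the web root
RewriteEngine On

# /files/topsecretfile.pdf  ->  /download.php?file=topsecretfile.pdf
RewriteRule ^files/([^/]+)$ download.php?file=$1 [L]
```

The pattern only matches a single path segment after files/, so a request with extra slashes never reaches download.php through this rule. The script should still clean its input either way.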

Enjoy!

edit: I just realized WordPress removed part of my code sample. Everything should be working fine now.

Digital Fish Library

I’d like to introduce you to the Digital Fish Library, a project I’m a developer on. The idea of the library is to catalog MRI data from hundreds of fish species worldwide and provide online analysis tools. It’s like open source research, so to speak. Scientists can log in and do dissections and other analyses if they wish. My job is to collect and post the data and develop the website.

The project is currently at http://www.digitalfishlibrary.org/ if you’re interested in taking a look. I’m in the process of completely rewriting the codebase to take advantage of the Smarty template engine and PHP5’s support for object oriented programming.

http://www.digitalfishlibrary.org/
http://www.php.net/
http://smarty.php.net/

ApacheCon Photos

I’m sorry this is such a lame post, but I’m in one of Chris Shiflett’s photos before Rasmus’s talk at ApacheCon.
In this photo, I’m the second person from the left, sitting down. You can really only see the back of my head and my shoulders, but I’m physically in front of the cameraman. Rasmus is the guy at the lectern.

Check out Chris Shiflett’s blog at http://www.shiflett.org/. He has dozens of ApacheCon photos, mostly of the other PHP guys there. While you’re at it, cruise on over to the PHP Security Consortium site and learn something new: http://www.phpsec.org/

The Meat at ApacheCon

Today really was my day at ApacheCon. Four of the five talks were on things I’m truly interested in – mostly PHP (see previous post). Rasmus gave an interesting talk about using PHP at Yahoo!. He gave some particulars about making high-performance, scalable systems. The other portion of his talk focused on XML support in PHP 5, as well as SOAP and REST services at Yahoo! (including a pretty cool Yahoo! Maps demonstration). There’s a similar demo on his toys blog: http://toys.lerdorf.com/. There were times when he went a little too deep into the details, though I don’t think that detracted from the quality of the talk.

There was another good talk called “Consuming Web Services using PHP 5” by Adam Trachtenberg (eBay). For the amount of time allotted I think it was a pretty good discussion on what to expect when working on REST and SOAP clients.

Scalable Web Architectures: decent. It’s one of those that really got me thinking about how German and I are going to design the DFL system (fewer hits, but extremely high bandwidth per user).

Now for the fun part of this post: Ruby on Rails (RoR). “Cheap, fast, and Good. You can have it all with Ruby on Rails.” It seems like every RoR demonstration I’ve seen, including this one, fails to capture much attention from the average web developer. When the presenter, who I believe is one of the main developers of RoR, says that a lot of it is “magic” that scares him because he doesn’t really know what’s going on, what are we supposed to think? Yeah – it’s great that they can make these easy-to-install frameworks, but you can’t deny that some amount of programming has to go into developing the framework, and after that, the consumer developers still have to figure it all out (or in many cases, practice some kind of voodoo automagical programming methods). Put it this way – it didn’t seem like a lot of those people were very excited after the talk. It appears RoR will remain a novelty for some time to come.

ApacheCon day 3

I’m in the company of celebrities in the world of PHP.

I just finished listening to a talk by Andrei Zmievski on Unicode character support in the upcoming PHP 6. Though I don’t know much about supporting multiple character sets (I’ve had no reason just yet to internationalize my code), I do know how much trouble PHP has with internationalization. Andrei did a very good job of explaining not only what the problems are in the current 5.1.x release, but also how PHP 6 is going to address each of them. Without going into any detail here, I can safely say that our jobs are going to be much easier.

The next talk is by Rasmus Lerdorf, the father of PHP. Why is this a big deal? PHP is not only an easy dynamic language to learn, but it’s also currently the most popular, and fastest growing, language on the web (according to sourceforge and other code repositories). His talk is going to be on large-scale PHP. Not quite where I’m at … YET, but these kinds of things are great because they tend to be very concerned with optimization and scaling.

Oh yeah – Christian Wenz and Chris Shiflett are also at these talks. Reminder: talk to Chris and Christian about securing files by denying access to directories and files via Apache, but reading them through PHP if the user has the appropriate privileges. Oh yeah – and the book, maybe.

Link: http://talks.lerdorf.com/show/acon05
