Author Archive

Sphinx With MAMP

It’s late and I’m pretty damn sleepy so I’ll just do this in quick lists. Today I was trying to install Sphinx on my local machine, a MacBook Pro. I’m using MAMP 1.9 to develop a new project which will involve Sphinx. However, installing Sphinx turned out to be kind of a pain in the ass. Just as a note, my laptop wasn’t running MySQL to begin with.

The problem occurred when trying to install Sphinx (latest stable version – 0.9.9) with MySQL support. Apparently MAMP 1.9 doesn’t come with all the MySQL files that Sphinx needs in order to build with MySQL support. Running ./configure on Sphinx just kept leading to dead ends and a message asking me to install the mysql-devel package or explicitly specify the MySQL library folders. Pointing it at various folders inside of MAMP didn’t turn out well either.

Now I’m sure there’s a more ‘ninja’ way to do this but this is what worked for me:

  • Install MySQL. Download the DMG from http://dev.mysql.com/downloads/mysql/. Make sure to download the correct one (32-bit vs. 64-bit). After that it’s a surprisingly simple install. You don’t need to start the service or install the MySQLStartupItem.pkg.
  • After that, ./configure on Sphinx worked just fine since MySQL and all the various files are in the standard locations. Make sure to include the path to install to via the --prefix option. The typical Sphinx installation usually goes in /usr/local/sphinx/ so: ./configure --prefix=/usr/local/sphinx
  • Now do make and sudo make install
  • Everything else is fairly straightforward.

Some extra notes:

  • When configuring your sphinx.conf file, make sure to set sql_sock = /Applications/MAMP/tmp/mysql/mysql.sock when editing your sources. Also don’t forget to specify the right port with sql_port (MAMP defaults to 8889 for MySQL).
  • Not sure why, but it kept looking for the config at /usr/local/etc/sphinx.conf, so when running indexer or search I had to explicitly specify the location of the sphinx.conf file with the --config option.
  • If you get permission errors when running indexer it’s probably because you forgot to sudo.

Drop me a comment if this helped you any or if you have other questions!


Importing PhotoPost Pro photos to vBulletin Albums

Background
First, a little background on this post. I own a SoCal mountain biking forum, www.socaltrailriders.org, and recently had to upgrade it from vBulletin 3.8.2 to 4.04. Along the way I also updated vbSEO and our classifieds system to make them compatible with 4.04, which was a major upgrade. We were using Photopost Pro for our photo gallery, but I wanted to move to vBulletin’s albums after having some issues with Photopost Pro.

I researched ways to import from Photopost Pro to vBulletin albums but found no solution. Being a developer myself, I decided to make one. This would be my first time making an import script for third party software but I figured it couldn’t be too crazy, right?

The Research
I started off by doing some exploration and figuring out how files and database data are laid out for both Photopost and vBulletin’s album system. I had to dig in a little bit to figure out how both pieces of software ticked, but I now have a fairly good grasp of how they’re structured, which is critical if you want to make an import script. vBulletin actually had a few more nuances than Photopost, which was fairly straightforward to figure out. For example, for a user with ID of 191, vBulletin stores attachments in the path “[attachment dir]/1/9/1/”.
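
The path rule from that example can be sketched in a few lines of PHP. This is only an illustration based on the user 191 case; the function name and the base directory are made up, and it is not vBulletin’s own code.

```php
<?php
// Hypothetical helper: build the attachment directory for a user ID
// by splitting the ID into its digits, e.g. 191 -> 1/9/1.
function vbAttachmentDir($baseDir, $userId)
{
    $digits = str_split((string) (int) $userId);
    return rtrim($baseDir, '/') . '/' . implode('/', $digits) . '/';
}

echo vbAttachmentDir('/var/www/attachments', 191), "\n";
// prints /var/www/attachments/1/9/1/
```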

Validation and Checking

I’m a strong believer in building code that handles all types of situations in a graceful manner. What if Photopost’s database references a file that no longer exists? What if you’re trying to move files somewhere that you don’t have write access to? These are just a couple of situations that you have to account for in an import script. So, I built a ton of it into this import script and tested it until I was very confident that it would handle every situation in an appropriate manner.
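
To give a feel for the kind of checks I mean, here is a minimal sketch of a pre-flight check for a single photo. The function name and the paths are hypothetical, and the real script checks more than this.

```php
<?php
// Hypothetical pre-flight check for one photo: returns an error message,
// or null when the copy should be safe to attempt.
function checkPhotoMove($sourceFile, $targetDir)
{
    if (!is_file($sourceFile)) {
        return "missing source file: $sourceFile";
    }
    if (!is_readable($sourceFile)) {
        return "cannot read source file: $sourceFile";
    }
    if (!is_dir($targetDir)) {
        return "target directory does not exist: $targetDir";
    }
    if (!is_writable($targetDir)) {
        return "no write access to: $targetDir";
    }
    return null;
}

// Example paths are made up for illustration.
$error = checkPhotoMove('/photopost/data/500/bike.jpg', '/vb/attachments/1/9/1');
if ($error !== null) {
    // Log and skip this photo instead of aborting the whole import.
    echo "Skipping photo ($error)\n";
}
```

Returning a message instead of dying lets the import log the problem and carry on with the next photo.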

“Features”

I say “features” because this is more of an overview of the functionality of my photopost pro -> vb import script. But I’ll just list the ones that come to mind:

  • Categories are preserved and can be configured a bit. Categories created by members in Photopost become albums in vBulletin. On our Photopost gallery, we had a series of categories that were shared amongst a lot of members. Since vBulletin albums aren’t shared, I coded the import script to create a new album for each user who had uploaded to one of those common categories.
  • Also, it can import multiple Photopost categories into one album. For example – on our Photopost gallery we had close to ten categories related to bikes which everybody shared. I configured it so that if I uploaded 4 photos across four of those bike categories, it imported those photos into one vBulletin album called “My Bikes” for me.
  • Photo titles and view counts were preserved.
  • Thumbnails were created based on the max attachment width specified inside of vBulletin’s options.
  • Lots of other little things…
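
The category-to-album mapping described above could be sketched roughly like this. The category names, the array layout, and both function names are invented for the example; the actual script is configured differently.

```php
<?php
// Hypothetical config: several shared Photopost categories collapse
// into one per-user vBulletin album title.
$sharedCategoryMap = array(
    'Road Bikes'     => 'My Bikes',
    'Mountain Bikes' => 'My Bikes',
    'Trail Photos'   => 'My Trail Photos',
);

// Pick the album title for a photo. Shared categories use the map;
// anything else (e.g. a member-created category) keeps its own name.
function albumTitleFor($categoryName, array $map)
{
    return isset($map[$categoryName]) ? $map[$categoryName] : $categoryName;
}

// Group one user's photos by the album they should land in.
function groupByAlbum(array $photos, array $map)
{
    $albums = array();
    foreach ($photos as $photo) {
        $albums[albumTitleFor($photo['category'], $map)][] = $photo['file'];
    }
    return $albums;
}
```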

What it doesn’t do: preserve comments or permissions.

Do You Need a PhotoPost Pro -> vBulletin Album Import?

I realize some of you out there may want or need to move from PhotoPost Pro to vBulletin so I’ll offer this import as a paid service. Comment or email me for information about that.

[Screenshot: albums in vBulletin freshly imported from Photopost Pro]


More Speed: CSS Sprites!

Hey folks! This one’s for those speed freaks who are never satisfied with their page load times. This is for good reason. Page load time could mean the difference between a conversion and a lost visitor.

Why?

There are many, many things you can do to improve site performance. Today I’ll explain CSS sprites and why they can help with performance. The goal of using CSS sprites is to minimize the number of HTTP requests. Each HTTP request has overhead – the browser has to send a request to your web server and then the web server has to respond with an answer (which could include an image, CSS file, etc). So minimize the number of HTTP requests and you speed up your page.

What’s a CSS Sprite?

What the heck is this CSS sprite thing that I’m talking about? Well you know all those little background images that you put on your webpage via CSS? They may be round corners, gradients, icons, etc. Well each of those comes with the price of HTTP requests. Using CSS sprites is essentially a technique of combining some of those background images into one image (called the sprite) to minimize the number of HTTP requests.

How?

Luckily there is a tool that makes generating this CSS sprite super easy: SpriteMe. It was created by Steve Souders, a performance guru. Once you create your CSS sprite, use it in your CSS by utilizing background-position. Using background-position, you can adjust which part of the CSS sprite shows up, allowing you to reuse the CSS sprite throughout your page. On a recent project I was involved with, we reduced the number of HTTP requests by 13!
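
To make the background-position part concrete, here is a small sketch that generates the CSS rules for a sprite. It assumes the icons are stacked top to bottom in one image; the icon names, heights, and sprite.png are made up, and this is not what SpriteMe outputs.

```php
<?php
// Hypothetical sprite layout: icons stacked top to bottom in sprite.png,
// each entry giving the icon's height in pixels.
$icons = array('home' => 16, 'search' => 16, 'cart' => 24);

// Emit one CSS rule per icon. Each rule shifts the shared image upward
// by the combined height of the icons above it.
function spriteCss(array $icons, $spriteUrl)
{
    $css = '';
    $offset = 0;
    foreach ($icons as $name => $height) {
        $y = $offset === 0 ? '0' : '-' . $offset . 'px';
        $css .= ".icon-$name { background: url($spriteUrl) no-repeat 0 $y; "
              . "height: {$height}px; }\n";
        $offset += $height;
    }
    return $css;
}

echo spriteCss($icons, 'sprite.png');
```

All three rules point at the same image, so the browser fetches it once and only the offset changes per icon.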
[Image: example of a CSS sprite]

Interested in more performance tips? Feel free to check out the Yahoo Developer Network, they have some great advice on this subject.


Minify! And New Site! Ohnoes!

This one will be quick.

Just got done setting up Minify on my local dev environment and then on the live environment. It rocks! It was a little fussy about the setup, but just pay close attention to the config.php file and you’ll be fine. It works as advertised and definitely helps on load time.

Oh – and I did this on my new website which I “soft-launched” yesterday. It’s a site that allows you to search for and buy shoes via a super easy to use website. The content is intelligently cached from Amazon’s web services. Keep an eye out for a post going into further detail on the implementation.

Update: CakePHP published my article about Got2BShoes in their Bakery.


Google Analytics Anomaly: More Visits than Page Views?

By definition, page views on any given site should always be at least as high as the number of visits. However, I and some other people have come across an anomaly in Google Analytics where visits are the higher of the two.

I don’t have a solution to offer yet, but I do have an explanation…

This is related to sub profiles and events. If you have a parent profile with events and make sub profiles from it, you will most likely run into this issue (until it is solved by Google). I’ll first explain events and sub profiles. If you’re already familiar, feel free to skip the next two sections.

Events

Events are not associated with page views at all. That is actually a perk of using an event to keep track of non-page-view information. In the past, we had to call _trackPageview() to report things that didn’t occur ‘naturally’. The drawback is that it created fake page views which inflated our page view numbers. Using events, that information is kept separate from page views, allowing us to gather useful data and still trust our page view figures.

Sub Profiles

In Google Analytics, every web property has a unique Web Property ID and a parent profile. You can then create other profiles for that web property and filter out various data. The filters usually take out traffic to certain URLs, certain hostnames, etc. This is useful when you want to view only a subset of the data or want to see an abstraction of it.

Ohnoes!

The anomaly occurs because filters in sub profiles won’t also filter events. So what? The events are associated with visits. Since events aren’t filtered, the visits come with them. If your parent profile has 562k visits with events (random number) then your sub profile will automatically have 562k visits even if you filtered out ALL traffic data. That’s exactly what happened in the screenshot from the first paragraph.

Hopefully that saves some head scratching out there. I’ll post again if I find a solution or if I notice that Google takes action on it.


Sphinx Search: An Appetizer

Recently I’ve been playing with Sphinx, which promises to be a great site search engine solution at my place of work. This post isn’t meant to be a comprehensive tutorial but a brief overview meant to whet your appetite.

What Is Sphinx?

Sphinx is a standalone search engine that can be used to power search capability in many applications. It’s extremely quick, relevant, scalable, and highly configurable.

If you’re trying to create search functionality and using MySQL to do ‘LIKE’ searches, I highly recommend you at least look into using Sphinx.

Get Sphinx

Download the source here at this link. My install experience on a Linux machine went very smoothly. It’s simply a matter of unpacking and then doing ./configure, make, make install. I decided to do ./configure --prefix=/usr/local/sphinx as that is the prefix used in the Sphinx documentation.

Here’s a quick rundown on the contents of your installation:

  • [sphinx dir]/etc/ – sphinx configuration file goes here
  • [sphinx dir]/var/data/ – index files
  • [sphinx dir]/bin/ – useful command line tools and the search daemon

Simple enough so far, right?

Setting Up Sphinx

Before you can start searching, you need to create and edit a sphinx.conf file. Go into the [sphinx dir]/etc/ folder and copy the example Sphinx configuration file. Go through it and edit it to your heart’s desire. Make sure to become good friends with the documentation as it’ll walk you through each and every available option.

The heart of the config file is as follows:

  • Define sources. Each source includes an SQL query. This query is the information you want to be searchable. You can even include fields in this query which you declare as attributes. Afterwards, you’ll be able to sort and/or filter by these attributes.
  • Define indices. Each index points to a source and includes various additional options for how the information is searched. There can be multiple indices pointing to the same source. When searching, you have the ability to search one or more specific indices.

But again, make sure to check out Sphinx’s own documentation. After setting up your config file, run the [sphinx dir]/bin/indexer tool to collect the data and make your indices.

Use Sphinx!

To search directly on the index without going through the daemon, use [sphinx dir]/bin/search. Doing this after running indexer is a good idea as it bypasses the APIs and the daemon, which can be a source of bugs or confusion. You might also want to play with [sphinx dir]/bin/indextool, which will give you some information about the indices you just created. Along with [sphinx dir]/bin/search, it can prove to be a great debugging tool.

If the search looks to be working well, go ahead and turn on the [sphinx dir]/bin/searchd daemon and try the APIs. Currently Sphinx provides APIs for PHP, Python, and Java. My experience was using the PHP version. These APIs have very useful options related to sorting, field weighting, filtering, etc.
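
Here is a rough sketch of what a call through the PHP API can look like. It expects a SphinxClient object from the sphinxapi.php file that ships with Sphinx. The function name, the index name ‘posts’, and the host and port (localhost:9312) are assumptions for the example, so use whatever your own sphinx.conf defines.

```php
<?php
// $client is expected to be a SphinxClient from sphinxapi.php.
// The index name 'posts', host and port are assumptions.
function searchPostIds($client, $terms)
{
    $client->SetServer('localhost', 9312);
    $client->SetLimits(0, 20);            // first 20 matches
    $result = $client->Query($terms, 'posts');

    if ($result === false) {
        // Query() returns false on failure; GetLastError() says why.
        error_log('Sphinx error: ' . $client->GetLastError());
        return array();
    }
    // Matches are keyed by document ID (absent when nothing matched).
    return isset($result['matches']) ? array_keys($result['matches']) : array();
}

// Usage, with searchd running:
// require 'sphinxapi.php';
// $ids = searchPostIds(new SphinxClient(), 'mountain bike');
```

The returned IDs are the document IDs from your source query, so the usual next step is to fetch those rows from MySQL.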

Ohnoes – API/daemon trouble!

This caused me hours of head scratching so I’m hoping I can save some people a bit of frustration here. Remember to restart the searchd daemon after you change configuration options! I kept getting garbage search results from the PHP API after changing sphinx.conf until I did.

Now go and give your users an awesome search experience!


code deployment with lftp and md5deep

After a long time of developing on a live server, the startup that I work for now has a development server as a result of getting funded. This is great but also necessitated the creation of a process to deploy code from test -> live.

I tackled this deployment issue by creating a few scripts which make special use of a couple nifty tools – lftp and md5deep.

lftp
Lftp is FTP on steroids. This tool, among its features, allows scripting and mirroring, which proved very valuable here.

md5deep
This tool is md5sum on steroids. The built-in md5sum is great for verifying file integrity, which is a great thing to do after moving code from one server to another. However, I just couldn’t figure out how to make it recursive and automatically do sub-directories. This is when I found md5deep. md5deep CAN work recursively and it does so quite well.

The process goes a little something like the following:
The test server runs a script which . . .

  • backs up the code (tar + gzip) and slaps that backup into a backup directory
  • runs md5deep on the directory and produces a file
  • uses lftp to mirror that directory (including md5deep’s file) to a deployment directory on the live site

The live server has a script which . . .

  • does a similar backup but of the live directory
  • moves code from the deployment directory to the live directory
  • runs md5deep to verify the integrity of all the files using the file created on the development server

If md5deep gives the okay, we’re good to go. Otherwise we investigate any files that seem to vary from development to live.
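
Our scripts lean on md5deep itself, but the idea behind the verification step can be sketched in plain PHP: build a recursive map of file hashes on one side, then compare it against the other. The function names here are made up and this is a stand-in for illustration, not the deployment script we run.

```php
<?php
// Build a relative-path => md5 map for every file under $dir,
// recursing into sub-directories (the job md5deep does in the post).
function md5Manifest($dir)
{
    $dir = rtrim($dir, '/');
    $manifest = array();
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if ($file->isFile()) {
            $relative = substr($file->getPathname(), strlen($dir) + 1);
            $manifest[$relative] = md5_file($file->getPathname());
        }
    }
    ksort($manifest);
    return $manifest;
}

// Return the relative paths that are missing or differ in $liveDir.
function changedFiles(array $expected, $liveDir)
{
    $actual = md5Manifest($liveDir);
    $bad = array();
    foreach ($expected as $path => $hash) {
        if (!isset($actual[$path]) || $actual[$path] !== $hash) {
            $bad[] = $path;
        }
    }
    return $bad;
}
```

An empty result from changedFiles() is the “okay”; anything else is the list of files to investigate.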

For more information on these two tools:
lftp
md5deep


Abstract Classes Vs. Interfaces

These two are new to PHP5 and are part of a new set of features that greatly improves OOP in PHP. At first glance abstract classes and interfaces look very similar but they have key differences. I’ll briefly introduce both and then lay out the significant differences.

Abstract Classes
An abstract class is a parent class that can’t be instantiated on its own; it only becomes useful once another class extends it. Instead of saying:

class messages
{
}

you would do this instead…
abstract class messages
{
}

In an abstract class, you can define properties and methods as you would normally with any class. The kicker is that you can add ‘abstract’ in front of your methods to just define that they are required in the child class and then the child class expands on those definitions. Kind of like a header file in C. Here’s an example:
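abstract class messages
{
    public $msg;

    // Child classes must implement these.
    abstract public function showMessage();
    abstract public function cleanMessage();

    // Shared implementation inherited by every child.
    public function setMessage($msg)
    {
        $this->msg = $msg;
    }
}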
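
To round that out, here is a sketch of a child class filling in the abstract methods. The htmlMessage class and its behavior are invented for the example, and the abstract class is repeated so the snippet runs on its own.

```php
<?php
abstract class messages
{
    public $msg;

    // Children must supply these.
    abstract public function showMessage();
    abstract public function cleanMessage();

    // Shared by every child as-is.
    public function setMessage($msg)
    {
        $this->msg = $msg;
    }
}

// Hypothetical child that renders the message as HTML.
class htmlMessage extends messages
{
    public function cleanMessage()
    {
        $this->msg = htmlspecialchars(trim($this->msg));
    }

    public function showMessage()
    {
        return '<p>' . $this->msg . '</p>';
    }
}

$m = new htmlMessage();
$m->setMessage('  Ride at 5 & bring lights ');
$m->cleanMessage();
echo $m->showMessage(), "\n"; // prints <p>Ride at 5 &amp; bring lights</p>
```

Trying `new messages()` directly, or leaving one of the abstract methods out of the child, is a fatal error.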

Interfaces, on the other hand, also allow you to define a “header” of sorts: you declare which methods must be in any class that uses it, BUT you can’t give those methods a body. Interfaces also can’t declare properties, only methods and constants. Here’s an example:
interface messages
{
    public function showMessage();
    public function cleanMessage();
    public function setMessage($msg);
}

From there, you can then start building a class that “implements” that interface and build out the actual guts of what the interface defines:
class myMessage implements messages
{
    var $msg = 'Default Message';

    public function showMessage()
    {
        // ....
    }
    public function cleanMessage()
    {
        // ....
    }
    public function setMessage($msg)
    {
        // ....
    }
}

Interfaces are great when you want to provide a kind of API that other classes must follow. Also, a class can implement multiple interfaces whereas you can only extend one abstract class. You can do this by separating each interface with a comma . . .
class myMessage implements messages, communication
{
    //.....
}

This is where the creative parts of a programmer’s brain must kick in . . . when do you use an abstract class and when does an interface make sense? This all depends on what the needs of your application are. Consider the differences between abstract classes and interfaces. And if that wasn’t enough to get your brain going . . . when is it best to just use a simple parent class vs. an abstract class?

I’d say keep these things in mind:

  • use a simple parent class when the child objects will share common properties and methods but then have a wide variety of their own properties & methods
  • use an abstract class when the child objects will share common properties and methods but also have similar functionality that they perform in different ways
  • use interfaces when you simply want to define a rigid ‘API’ that the classes must follow

But all applications vary so you must do what will work best for your application.

And lastly . . . don’t spend a large chunk of your development time debating between these little intricacies. Speed is a competitive advantage. Make your choice and follow through with it.


One Salt Per Hashed Password

Intro
Hashing passwords is a great idea (Note: some people use the terms hashed and encrypted interchangeably). The basic premise is easy: make a hash of a password that only goes one way . . . password –> hashed string. You can’t go from hashed string to password. Then the process of logging in goes like this:

  • take a user-supplied username/password combination
  • run the same hash process on that password
  • take the result and compare it to what you have in the database

Salt!
What could be even better? Adding a salt, of course! A salt is an additional string/characters that you add to the password before you make a hash of it. This increases the ‘randomness’ of the resulting hash. No longer is it made by just (for example) a weak password like ‘password’, but an extra salt is added to increase the strength.

Doing this, the process to login a user would be as follows:

  • take a user-supplied username/password combination
  • run the same hash process on that password, including the same salt
  • take the result and compare it to what you have in the database

Even Better
Now, let’s say hypothetically that an attacker gains access to your database full of hashed passwords. If they manage to figure out the hash and the salt, they can attack all of those passwords at once, because the same salt applies to every one of them. An additional hurdle you can throw at a would-be attacker is making sure you use a different salt on each password. How? Well, there are many ways to do this and you can get creative. One common method is using something such as a username in the salt.

Now, the process would go like this:

  • take a user-supplied username/password combination
  • generate your “dynamic” salt using the person’s username, then use that salt to make a hash of the password
  • take the result and compare it to what you have in the database

But What If . . .
What happens if the user changes their username? Wouldn’t that make it impossible to then generate the same salt originally used to hash the password? Well, yes. In this case, ask the user to also resubmit their password. Upon changing the username, regenerate the hash as well and store it in the database.
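
The username-based scheme described above can be sketched like this. It is a minimal illustration, not production code: the SITE_SALT constant, the function names, and the choice of SHA-256 are all assumptions made for the example.

```php
<?php
// Hypothetical site-wide secret mixed into every salt.
define('SITE_SALT', 'change-me-to-a-long-random-string');

// Build the per-user ("dynamic") salt from the username.
function saltFor($username)
{
    return SITE_SALT . $username;
}

// Hash a password with that user's salt.
function hashPassword($username, $password)
{
    return hash('sha256', saltFor($username) . $password);
}

// Login check: re-run the same process and compare with the stored hash.
function checkLogin($username, $password, $storedHash)
{
    return hash_equals($storedHash, hashPassword($username, $password));
}

$stored = hashPassword('alice', 'password');            // saved at signup
var_dump(checkLogin('alice', 'password', $stored));     // bool(true)
var_dump(checkLogin('alice', 'letmein', $stored));      // bool(false)
var_dump(hashPassword('bob', 'password') === $stored);  // bool(false)
```

The last line is the point of the exercise: the same weak password produces a different hash for each user. On PHP 5.5 and later, the built-in password_hash() and password_verify() generate and store a random salt per password, so the salt no longer depends on the username at all.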

Interesting debate/discussion on this @ devnetwork.


Test Driven Development

What’s the best way to sleep at night and what’s the best way to be fairly sure that your code is as bulletproof as possible? Learn about test driven development and learn to love tests! There have been numerous articles that go in depth about this subject so I’ll try to be brief about it…

No real purist will distill TDD into just a few guidelines but I will anyways:

  • Think about what functionality you want to build. Design and code a test that will test that functionality (before you code the actual functionality!).
  • Run the test and make sure it fails as expected.
  • Code your functionality.
  • Run your test again. Did it pass? Congrats, move on to the next task. Didn’t pass? You have some bug fixing to do.
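
Here is the cycle above in miniature. To keep it self-contained it uses a hand-rolled check in plain PHP rather than SimpleTest or PHPUnit, and the slugify() function is just a made-up piece of functionality to test.

```php
<?php
// Step 1: the test, written before slugify() existed. Running it at that
// point fails (undefined function), which is the expected first failure.
function testSlugify()
{
    $cases = array(
        'Hello World'      => 'hello-world',
        '  CSS Sprites!  ' => 'css-sprites',
        'PHP5: OOP & You'  => 'php5-oop-you',
    );
    $failures = array();
    foreach ($cases as $input => $expected) {
        $actual = slugify($input);
        if ($actual !== $expected) {
            $failures[] = "slugify('$input') gave '$actual', expected '$expected'";
        }
    }
    return $failures;
}

// Step 3: the functionality, written to make the test pass.
function slugify($title)
{
    $slug = strtolower(trim($title));
    // Collapse every run of non-alphanumeric characters into one dash.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);
    return trim($slug, '-');
}

// Steps 2 and 4: run the test.
$failures = testSlugify();
echo $failures ? implode("\n", $failures) . "\n" : "All tests passed\n";
```

A real test framework gives you the same loop with nicer reporting, plus a way to run the whole suite at once.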

The trick here is to put more emphasis on the design of the functionality than you’re used to. Think it through and design it so that it’s testable. If your test fails after coding the functionality and you feel compelled to fix the test rather than the functionality, then one of two things happened: 1) you did not give enough thought to the design of the functionality to make it easily testable or 2) you coded the test wrong. In either case, the design part of it all needed work.

Rinse, repeat, rinse, repeat.

The beauty in having a suite of tests is peace of mind. No, not just a warm fuzzy feeling. I’m talking about time-saving, bug-identifying, save-my-ass, look-ma-no-bugs peace of mind. Did you just add a new feature or refactor some code? Run your suite of tests. You’ll be able to spot the bugs much more quickly. Perhaps you’ll spot something that might have otherwise gone live.

I personally use SimpleTest to do my testing. PHPUnit is supposed to be pretty good as well.

Some links if you want to read more:
http://www.onpk.net/talks/fosdem2005/introduction_simpletest.html
http://www.developerspot.com/print/php/test-driven-development/

