
10.2 Reading and Writing Entire Files

This section shows you how to work with an entire file at once, as opposed to manipulating just a few lines of a file. PHP provides special functions for reading or writing a whole file in a single step.

10.2.1 Reading a File

To read the contents of a file into a string, use file_get_contents( ). Pass it a filename, and it returns a string containing everything in the file. Example 10-1 reads the file in Example 10-2 with file_get_contents( ), modifies it with str_replace( ), and then prints the result.

Example 10-1. Using file_get_contents( ) with a page template
// Load the file from Example 10-2
$page = file_get_contents('page-template.html');

// Insert the title of the page
$page = str_replace('{page_title}', 'Welcome', $page);

// Make the page blue in the afternoon and
// green in the morning
if (date('H') >= 12) {
    $page = str_replace('{color}', 'blue', $page);
} else {
    $page = str_replace('{color}', 'green', $page);
}

// Take the username from a previously saved session
// variable
$page = str_replace('{name}', $_SESSION['username'], $page);

// Print the results
print $page;

Example 10-2. page-template.html for Example 10-1
<html>
<head><title>{page_title}</title></head>
<body bgcolor="{color}">

<h1>Hello, {name}</h1>

</body>
</html>

Every time you use a file access function, you need to check that it didn't encounter an error because of a lack of disk space, a permission problem, or some other failure. Error checking is discussed in detail in Section 10.6. The examples in the next few sections leave out error-checking code so that you can see the file access functions at work without other new material getting in the way. Real programs always need to check for errors after calling a file access function.
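As a small preview of the check this note asks for, here is a minimal sketch; Section 10.6 covers the topic properly. The helper name read_whole_file( ) is hypothetical and not part of the book's examples. It relies on the fact that file_get_contents( ) returns false when it fails.

```php
<?php
// Hypothetical helper (an assumption for illustration): read a whole file,
// or return null when file_get_contents() reports a failure.
function read_whole_file($filename) {
    // file_get_contents() returns false on failure. The @ operator hides
    // the warning so the caller can decide how to report the problem.
    $contents = @file_get_contents($filename);
    if ($contents === false) {
        return null;
    }
    return $contents;
}

$page = read_whole_file('page-template.html');
if ($page === null) {
    print "Could not read page-template.html\n";
} else {
    print $page;
}
```

The comparison uses === rather than == because an empty file produces an empty string, which == would treat as equal to false.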


With $_SESSION['username'] set to Jacob, Example 10-1 prints the following when it runs in the morning:

<html>
<head><title>Welcome</title></head>
<body bgcolor="green">

<h1>Hello, Jacob</h1>

</body>
</html>
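The separate str_replace( ) calls in Example 10-1 can also be folded into one, because str_replace( ) accepts arrays of search and replacement strings. The sketch below assumes a shortened stand-in for the template and fixed values in place of the clock and the session, so that its result is predictable.

```php
<?php
// Shortened stand-in for the contents of page-template.html (an assumption)
$page = '<title>{page_title}</title><body bgcolor="{color}"><h1>Hello, {name}</h1></body>';

// Fixed, made-up values so the result does not depend on the time of day
// or on a saved session
$values = array(
    '{page_title}' => 'Welcome',
    '{color}'      => 'green',
    '{name}'       => 'Jacob',
);

// The first array holds the placeholders, the second their replacements
$page = str_replace(array_keys($values), array_values($values), $page);

print $page;
```

The replacements are applied one after another, so a replacement value that itself contains a later placeholder would be rewritten as well.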

A local file and a remote file look the same to file_get_contents( ). If you pass a URL to file_get_contents( ), it reads the web page at that URL. Example 10-3 retrieves a weather report from the U.S. National Weather Service. It uses strstr( ), strpos( ), and substr( ) to scoop out and print just the part of the page that contains the forecast for the upcoming week.

Example 10-3. Retrieving a remote page with file_get_contents( )
$zip = 98052;

$weather_page = file_get_contents('http://www.srh.noaa.gov/zipcity.php?inputstring=' .
                                  $zip);

// Just keep everything after the "Detailed Forecast" image alt text
$page = strstr($weather_page,'Detailed Forecast');
// Find where the forecast <table> starts
$table_start = strpos($page, '<table');
// Find where the <table> ends
// Need to add 8 to advance past the </table> tag
$table_end  = strpos($page, '</table>') + 8;
// And print a slice of $page that holds the table
print substr($page, $table_start, $table_end - $table_start);

Obviously, what the weather is going to be in the coming days varies constantly, but Example 10-3 prints something like:

<table cellspacing="0" cellpadding="3" border="0" width="326">
        <tr>
        <td><a name="contents"></a> <b>Today</b>. Numerous showers developing by
noon. A chance of afternoon
thunderstorms. Highs in the mid 50s. Southwest wind 10 to 15 mph. <br><br>
<b>Tonight</b>. Numerous showers and chance of thunderstorms in the
evening. Then mostly cloudy. Lows near 40. Southwest wind near 10
mph. <br><br>
<b>Friday</b>. Partly cloudy. A chance of afternoon showers. Highs in the
mid to upper 50s. South wind near 10 mph shifting to the west in the
afternoon. <br><br>
<b>Friday night</b>. Partly cloudy. A chance of evening showers. Lows in
the upper 30s. Light wind. <br><br>
<b>Saturday</b>. Partly cloudy. A chance of afternoon showers. Highs in
the mid 50s. Southwest wind near 10 mph in the morning becoming
light. <br><br>
<b>Saturday night</b>. Partly cloudy. A chance of evening showers. Lows
in the mid 30s. <br><br>
<b>Sunday</b>. Partly sunny. Highs in the upper 50s. <br><br>
<b>Sunday night</b>. Partly cloudy. Lows in the upper 30s. <br><br>
<b>Monday</b>. Partly sunny. Highs in the lower 60s. <br><br>
<b>Monday night</b>. Partly cloudy. Lows in the lower 40s. <br><br>
<b>Tuesday</b>. Mostly cloudy. A chance of rain. Highs in the lower 60s. <br><br>
<b>Tuesday night</b>. Mostly cloudy. A chance of rain. Lows in the lower
40s. <br><br>
<b>Wednesday</b>. Mostly cloudy. A chance of rain. Highs in the upper
50s. <br>&&
               temperature      /     precipitation
gold bar          54   40   56  /  50   50   40
enumclaw          55   39   56  /  60   60   40
north bend        56   40   57  /  60   60   40
<br><br>
</td>
        </tr>
        </table>

Retrieving a remote URL and slicing out a chunk of it for your use is called screen scraping. It's a popular and easy way to incorporate remote data sources into your programs. There are two things to be concerned with, though, when you engage in scraping.

First, screen scraping can be fragile. The slightest changes in page structure can break your carefully tuned string parsing. If the National Weather Service decides to change the HTML around their Short Term Forecast, then Example 10-3 might no longer parse the page correctly. (Perhaps this has already happened since this paragraph was written!)
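One way to soften this fragility is to notice when the expected landmarks are missing instead of printing a garbled slice. The sketch below is an assumption about how that might look, not the book's method: the helper extract_table_after( ) and the sample HTML are made up for illustration. It relies on strstr( ) and strpos( ) returning false when they find nothing.

```php
<?php
// Hypothetical helper: return the first <table>...</table> that follows
// $marker, or null if the page no longer has the expected structure.
function extract_table_after($html, $marker) {
    $rest = strstr($html, $marker);
    if ($rest === false) {
        return null;                  // the marker text is gone
    }
    $start = strpos($rest, '<table');
    if ($start === false) {
        return null;                  // no table after the marker
    }
    // Search for the closing tag only after the opening tag
    $end = strpos($rest, '</table>', $start);
    if ($end === false) {
        return null;                  // the table never closes
    }
    $end += strlen('</table>');
    return substr($rest, $start, $end - $start);
}

// Made-up sample standing in for a downloaded page
$sample = '<img alt="Detailed Forecast"><table><tr><td>Sunny</td></tr></table><p>footer</p>';

$forecast = extract_table_after($sample, 'Detailed Forecast');
if ($forecast === null) {
    print "The page layout seems to have changed\n";
} else {
    print $forecast . "\n";
}
```

Each check uses === false because a match at position 0 is a legitimate result that == would confuse with failure.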

The second issue with screen scraping is its propriety. The National Weather Service explicitly puts its information in the public domain, but most web sites don't. Before you scrape another site and incorporate its content into your own, be sure that you have permission to do so.

For in-depth screen scraping, consider using regular expressions. With the pattern-matching power of a regular expression, you can flexibly carve up a retrieved web page. Regular expressions are helpful for screen-scraping tasks such as extracting all the links from a page or pulling the content out of individual HTML table cells; you will learn about them in Appendix B.
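As a small taste of the link-extraction task mentioned above, the following sketch uses preg_match_all( ) on a made-up snippet of HTML. The pattern is a deliberately simple assumption: it only recognizes href attributes in double quotes and would miss single-quoted or unquoted ones.

```php
<?php
// Made-up HTML standing in for a page fetched with file_get_contents()
$html = '<p><a href="/today.html">Today</a> and ' .
        '<a href="http://www.example.com/">elsewhere</a></p>';

// Capture the value of each double-quoted href attribute inside an <a> tag.
// $matches[1] receives the text matched by the parenthesized group.
preg_match_all('/<a\s[^>]*href="([^"]*)"/i', $html, $matches);

foreach ($matches[1] as $link) {
    print "$link\n";
}
```

With the sample above, the loop prints /today.html and http://www.example.com/ on separate lines.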

10.2.2 Writing a File

The counterpart to reading the contents of a file into a string is writing a string to a file, and the counterpart to file_get_contents( ) is file_put_contents( ). Example 10-4 extends Example 10-3 by saving the weather forecast in a local file in addition to printing it.

Example 10-4. Saving a file with file_put_contents( )
$zip = 98052;

$weather_page = file_get_contents('http://www.srh.noaa.gov/zipcity.php?inputstring=' .
                                  $zip);

// Just keep everything after the "Detailed Forecast" image alt text
$page = strstr($weather_page,'Detailed Forecast');
// Find where the forecast <table> starts
$table_start = strpos($page, '<table');
// Find where the <table> ends
// Need to add 8 to advance past the </table> tag
$table_end  = strpos($page, '</table>') + 8;
// And get the slice of $page that holds the table
$forecast = substr($page, $table_start, $table_end - $table_start);
// Print the forecast
print $forecast;
// Save the forecast to a file
file_put_contents("weather-$zip.txt", $forecast);

Example 10-4 writes the value of $forecast (the weather forecast) to the file weather-98052.txt. The first argument to file_put_contents( ) is the filename to write to, and the second argument is what to write to the file.
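The sketch below shows two details of file_put_contents( ) that go beyond its two basic arguments: its return value and the optional FILE_APPEND flag. The file name and the forecast text are made up for illustration. By default the function replaces whatever the file held before, and it returns the number of bytes written, or false on failure.

```php
<?php
// Hypothetical file name in the system temporary directory
$file = sys_get_temp_dir() . '/weather-example.txt';

// Replace any existing contents. The return value is the number of
// bytes written, or false if the write failed.
$bytes = file_put_contents($file, "Today. Showers.\n");
if ($bytes === false) {
    print "Could not write $file\n";
}

// With FILE_APPEND as the third argument, the new text is added to the
// end of the file instead of replacing what is there
file_put_contents($file, "Tonight. Mostly cloudy.\n", FILE_APPEND);

// Read the file back to see both lines
print file_get_contents($file);
```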

Just as file_get_contents( ) accepts a URL to read a remote file, file_put_contents( ) accepts a URL to write a remote file. The kinds of URLs that are acceptable to file_put_contents( ) are more limited, however, because not all kinds of remote servers allow you to write files. Usually, you can only write a remote file via an FTP URL, and the FTP server involved must grant the appropriate permissions. Example 10-5 constructs a templated page as in Example 10-1, and then uses file_put_contents( ) to save the page on a remote server via FTP.

Example 10-5. Saving a remote file via FTP with file_put_contents( )
// Load the file from Example 10-2
$page = file_get_contents('page-template.html');

// Insert the title of the page
$page = str_replace('{page_title}', 'Welcome', $page);

// Make the page blue in the afternoon and
// green in the morning
if (date('H') >= 12) {
    $page = str_replace('{color}', 'blue', $page);
} else {
    $page = str_replace('{color}', 'green', $page);
}

// Take the username from a previously saved session
// variable
$page = str_replace('{name}', $_SESSION['username'], $page);

// Instead of printing the results, save the page on a
// remote FTP server
file_put_contents('ftp://bruce:hax0r@ftp.example.com/usr/local/htdocs/welcome.html',
                  $page);

In Example 10-5, the FTP URL passed to file_put_contents( ) means "log in to ftp.example.com with username bruce and password hax0r, and write to the file /usr/local/htdocs/welcome.html."
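To see how that URL breaks down into its parts, you can hand it to parse_url( ), which splits a URL into an array of components without contacting any server. This sketch is only an illustration of the URL's structure; it is not a claim about how file_put_contents( ) processes the URL internally.

```php
<?php
$url = 'ftp://bruce:hax0r@ftp.example.com/usr/local/htdocs/welcome.html';

// parse_url() only examines the string; it makes no network connection
$parts = parse_url($url);

print "Protocol: {$parts['scheme']}\n";
print "Server:   {$parts['host']}\n";
print "Username: {$parts['user']}\n";
print "Password: {$parts['pass']}\n";
print "File:     {$parts['path']}\n";
```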
