PHP fgetss() Function


PHP fgetss() - Get Line with HTML Stripped

In this tutorial, we'll explore the fgetss() function in PHP, which reads lines from a file while stripping out HTML and PHP tags. It is useful when you need to extract clean, readable text from files that may contain HTML markup or embedded PHP code, for example when processing logs, cleaning up imported HTML files, or preparing text for display. Note that fgetss() was deprecated in PHP 7.3.0 and removed in PHP 8.0.0, so scripts that call it directly only run on PHP 7.4 or earlier.

Prerequisites

  • Basic knowledge of PHP programming
  • Understanding of file handling in PHP
  • PHP installed on your system (PHP 7.4 or earlier to call fgetss() directly; the function was removed in PHP 8.0.0)
  • Access to a text editor or IDE to write PHP scripts

Setup Steps

  1. Create or obtain a text file containing HTML and PHP tags (e.g., sample.html).
  2. Write a PHP script that opens the file using fopen().
  3. Use fgetss() to read each line, automatically stripping tags.
  4. Display or process the cleaned text output as needed.

Understanding PHP fgetss() Function

The fgetss() function reads a single line from a file pointer and strips all HTML and PHP tags from that line. It is similar to fgets() but with the added tag-stripping feature.

Function signature:

fgetss(resource $handle, int $length = ?, string $allowable_tags = ?): string|false
  • $handle: The file pointer returned by fopen().
  • $length: Optional. At most $length - 1 bytes are read; reading also stops at a line ending (which is included in the result) or at EOF. If omitted, the function reads to the end of the line. When given, it must be greater than 0.
  • $allowable_tags: Optional. A string listing the tags that should not be stripped, for example "<b><i>".

Returns the line with stripped tags or false on EOF or error.
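
Because fgetss() is not available in PHP 8, the parameters above can be illustrated with a small stand-in built from fgets() and strip_tags(). This is a minimal sketch, not a full reimplementation: the name read_stripped_line() is hypothetical, and since each line is stripped on its own, it assumes that no tag spans more than one line.

```php
<?php
// Hypothetical stand-in for fgetss(); not part of PHP itself.
// Assumption: every tag opens and closes on the same line.
function read_stripped_line($handle, ?int $length = null, string $allowable_tags = '')
{
    // fgets() returns at most $length - 1 bytes, or the whole line when no length is given.
    $line = ($length === null) ? fgets($handle) : fgets($handle, $length);

    if ($line === false) {
        return false; // EOF or read error
    }

    return strip_tags($line, $allowable_tags);
}

// Usage with an in-memory stream instead of a real file.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "<p>Hello <b>world</b></p>\n<i>Second</i> line\n");
rewind($handle);

$first  = read_stripped_line($handle);              // all tags stripped
$second = read_stripped_line($handle, 4096, '<i>'); // <i> is kept
$third  = read_stripped_line($handle);              // false: nothing left to read
fclose($handle);

var_dump($first, $second, $third);
```

The line ending stays in the returned string, as it does with fgets(), so trim it if you do not need it.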

Example 1: Basic Usage of fgetss()

Let's create a simple script to read and clean each line from a sample HTML file:

<?php
// Open the file for reading
$handle = fopen("sample.html", "r");

if ($handle) {
    while (($line = fgetss($handle)) !== false) {
        echo $line . "<br>\n";
    }
    fclose($handle);
} else {
    echo "Failed to open file.";
}
?>

Explanation:
- We open sample.html in read mode.
- Inside the loop, fgetss() reads one line at a time and strips out any HTML and PHP tags.
- The cleaned line is printed with a line break.
- The file handle is closed afterward.

Example 2: Using $allowable_tags Argument

You can preserve certain tags when stripping by specifying them with the $allowable_tags argument. Here's how:

<?php
$handle = fopen("sample.html", "r");
if ($handle) {
    // Allow <b> and <i> tags to remain
    while (($line = fgetss($handle, 4096, "<b><i>")) !== false) {
        echo $line . "<br>\n";
    }
    fclose($handle);
} else {
    echo "Cannot open file.";
}
?>

This will keep <b> and <i> tags in the output while stripping all other tags.
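
The allow-list behaviour can also be checked without a file by calling strip_tags() directly, which accepts the same "<b><i>" format. The input strings below are made up for illustration. The second call shows a detail worth knowing: allowed tags keep their attributes.

```php
<?php
// Made-up input for illustration.
$input = '<p>Keep <b>bold</b> and <i>italic</i>, drop <a href="#">links</a>.</p>';

// Only <b> and <i> survive; every other tag is removed, but its inner text stays.
$clean = strip_tags($input, '<b><i>');
echo $clean, "\n";

// Attributes on allowed tags are NOT removed.
$risky = strip_tags('<b onclick="alert(1)">hi</b>', '<b>');
echo $risky, "\n";
```

Because attributes such as onclick pass through untouched, an allow-list alone does not make the output safe to embed in a page.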

Best Practices

  • Always check if the file opened successfully. fopen() returns false on failure, and passing false to a read function raises a warning in PHP 7 and a TypeError in PHP 8.
  • Specify a reasonable $length if dealing with very long lines. It caps how many bytes are held in memory per read.
  • Use the $allowable_tags parameter judiciously. Only permit tags you explicitly want to preserve.
  • Always close file handles after finishing. Use fclose() to release resources.
  • Be aware that fgetss() strips PHP tags too. If you need to preserve them, read the line with fgets() instead.

Common Mistakes

  • Assuming fgetss() strips tags from the entire file at once; it reads and strips line by line.
  • Not handling the case where fgetss() returns false (EOF or error).
  • Passing an invalid file handle to fgetss(), which causes runtime errors.
  • Forgetting to close the file handle, leading to resource leaks.
  • Using fgetss() on binary or non-text files, which can produce unexpected results.
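
Several of these mistakes can be avoided with one defensive reading loop. The sketch below uses fgets() with strip_tags() so that it also runs on PHP 8; the helper name read_clean_lines() and the temporary file are assumptions made for the example.

```php
<?php
// Hypothetical helper: returns the cleaned lines, or null if the file cannot be read.
function read_clean_lines(string $path): ?array
{
    if (!is_readable($path)) {
        return null; // never pass an invalid handle to the read loop
    }
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return null;
    }

    $lines = [];
    try {
        while (($line = fgets($handle)) !== false) {
            $lines[] = rtrim(strip_tags($line), "\r\n");
        }
        // false before EOF means a read error rather than the end of the file.
        if (!feof($handle)) {
            throw new RuntimeException("Read error in $path");
        }
    } finally {
        fclose($handle); // always release the handle
    }

    return $lines;
}

// Usage with a temporary file standing in for sample.html.
$path = tempnam(sys_get_temp_dir(), 'fgetss');
file_put_contents($path, "<h1>Title</h1>\n<p>Body <em>text</em></p>\n");
$lines = read_clean_lines($path);
unlink($path);

print_r($lines);
```

The feof() check after the loop is what distinguishes a normal end of file from a failed read, since both make the read function return false.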

Interview Questions

Junior Level

  • Q1: What does the PHP function fgetss() do?
    A: It reads a line from a file and strips all HTML and PHP tags from the line.
  • Q2: What parameters does fgetss() accept?
    A: It accepts a file handle, an optional length to read, and an optional string of allowable tags.
  • Q3: How is fgetss() different from fgets()?
    A: fgetss() strips HTML and PHP tags from the line read, while fgets() does not.
  • Q4: How would you read a file line by line stripping HTML tags using fgetss()?
    A: Use a loop with fgetss() on the file handle until it returns false.
  • Q5: Can you specify tags to allow in fgetss()? How?
    A: Yes, by passing the allowed tags as the third argument, e.g., "<b><i>".

Mid Level

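  • Q1: What happens if you pass zero as the length parameter in fgetss()?
    A: Zero does not mean "no limit". The length must be greater than 0; passing zero raises a warning and the call returns false. To read up to the end of the line, omit the argument.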
  • Q2: How does fgetss() handle PHP tags? Are they stripped by default?
    A: Yes, PHP opening and closing tags are stripped by default.
  • Q3: Is fgetss() safe to use with binary files? Why or why not?
    A: No. The function splits input at line endings and removes anything that looks like a tag, so binary data is returned in arbitrary chunks and bytes can be silently dropped.
  • Q4: How can you improve performance when using fgetss() on large files?
    A: Specify a length limit to avoid reading extremely long lines into memory at once.
  • Q5: If you want to preserve some tags, but not all, what should you be cautious about?
    A: Ensure the allowable tags argument only includes safe and desired tags to avoid XSS or unexpected formatting.

Senior Level

  • Q1: How could you implement similar functionality as fgetss() using fgets() and other functions?
    A: Read the line with fgets() and then use strip_tags() to remove unwanted tags.
  • Q2: Why might you prefer fgetss() over reading the entire file and then processing it?
    A: It reduces memory usage by processing the file line-by-line and stripping tags on-the-fly.
  • Q3: Discuss potential security concerns when using the $allowable_tags parameter.
    A: Allowing unsafe tags could lead to XSS vulnerabilities if output is displayed in browsers; sanitize or validate allowed tags carefully.
  • Q4: How does fgetss() interact with multibyte encodings like UTF-8?
    A: It is byte-oriented and might break multibyte characters if the length parameter splits multibyte sequences; careful handling is needed.
  • Q5: Given that fgetss() was deprecated in PHP 7.3.0 and removed in PHP 8.0.0, what modern alternatives could you use?
    A: Use fgets() combined with strip_tags() for a similar effect, or use a DOM parsing library for complex HTML.
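
The multibyte concern raised in the Senior Level questions can be reproduced with fgets(), which applies the same "length - 1 bytes" limit. The sketch assumes fgetss() split lines the same way; the sample text is made up.

```php
<?php
$handle = fopen('php://memory', 'w+');
fwrite($handle, "caf\u{00E9} au lait\n"); // the accented "e" takes two bytes in UTF-8
rewind($handle);

// A length of 5 means at most 4 bytes: "caf" plus only the first byte of the accented character.
$chunk = fgets($handle, 5);
$rest  = fgets($handle);
fclose($handle);

// preg_match() with the /u flag fails on invalid UTF-8, so it doubles as a validity check.
$chunkIsValid = preg_match('//u', $chunk) === 1;
$wholeIsValid = preg_match('//u', $chunk . $rest) === 1;

var_dump($chunkIsValid); // bool(false)
var_dump($wholeIsValid); // bool(true)
```

One way to stay safe is to omit the length for text files, or to join the chunks of a line before stripping tags or validating the encoding.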

FAQ

Is fgetss() still recommended to use in modern PHP?
No. fgetss() was deprecated in PHP 7.3.0 and removed in PHP 8.0.0. Use fgets() with strip_tags() as a replacement.

What is the difference between strip_tags() and fgetss()?
fgetss() reads from a file and strips tags on the fly, line by line, while strip_tags() removes HTML and PHP tags from a string you already have.

Can I use fgetss() to read an entire HTML file at once?
No, fgetss() reads one line at a time. To read the entire content, use file_get_contents() and then process it with strip_tags().

What happens if I use fgetss() on binary files like images?
It can return unpredictable results and is not suitable for binary files, since it expects text data.

How can I preserve only certain HTML tags when reading a file?
Use the third argument of fgetss() to specify allowable tags, e.g. "<b><i>"; with strip_tags(), pass the same string as the second argument. Allowed tags keep their attributes, so do not treat the result as safe HTML without further sanitizing.
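
The whole-file approach mentioned in the FAQ has one practical advantage over stripping each line separately with strip_tags(): it copes with tags that span several lines. The sketch below compares both on a made-up temporary file.

```php
<?php
// Temporary file with made-up content; the <a> tag spans two lines.
$path = tempnam(sys_get_temp_dir(), 'html');
file_put_contents(
    $path,
    "<p>Intro</p>\n<a href=\"https://example.com\"\n   title=\"Example\">link text</a>\n"
);

// Whole-file approach: the multi-line tag is removed cleanly.
$whole = strip_tags(file_get_contents($path));

// Per-line approach: each strip_tags() call sees only half of the <a> tag,
// so attribute text from the second line leaks into the result.
$perLine = implode('', array_map('strip_tags', file($path)));

unlink($path);

echo $whole;
echo $perLine, "\n";
```

For very large files the whole-file approach costs more memory, so the choice depends on file size and on whether the markup can contain multi-line tags.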

Conclusion

The PHP fgetss() function provides a straightforward way to read lines from a file while stripping out unwanted HTML and PHP tags, which suits cases where clean text extraction is needed. Although it was deprecated in PHP 7.3.0 and removed in PHP 8.0.0, understanding its behavior helps when maintaining older code and illustrates PHP text processing fundamentals. For modern code, combining fgets() with strip_tags() is recommended. Always handle files carefully: check that the file opened, specify length limits, close handles, and validate any allowable tags to maintain security and performance.