Example program to search Write.as/WriteFreely posts from API JSON data

A technique for searching public Write.as or WriteFreely blogs is documented in a separate post. If you also want to search your anonymous posts and/or private blogs, the API-based technique described here works well.

Recently I began looking into the API features of Write.as and WriteFreely, specifically to retrieve the content of all my blog posts and anonymous posts in JSON format so that I could run compound searches on the data.

My intention was to periodically download the data to my website using the API and to write one or more custom PHP programs to query it from a web page.

The code below is an example showing the API command along with a simple program to do a compound search of the data.

The PHP program illustrates navigating the JSON data and can be used to query one or multiple user instances. The data from a user's instances may include multiple blogs (whether private or published) and any number of anonymous posts.
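As a rough illustration of what navigating the JSON data involves, here is a minimal sketch. It assumes the response has a top-level "data" array and that blog posts carry a "collection" object while anonymous posts do not, as in the full program. The sample data and the groupPosts() helper are hypothetical and are not part of the program itself.

```php
<?php
// Hypothetical sample in the shape of the API response: a top-level "data"
// array of posts. Blog posts carry a "collection" object; anonymous posts do not.
$json = '{"data": [
  {"id": "aaaa1111", "slug": "first-post", "title": "First post", "tags": ["php"],
   "body": "Hello", "collection": {"alias": "myblog", "title": "My Blog"}},
  {"id": "bbbb2222", "slug": null, "title": "", "tags": [], "body": "A note"}
]}';

// Group post labels by blog alias, with anonymous posts under "anonymous".
// (groupPosts is an illustrative helper, not a function from the program.)
function groupPosts($json) {
  $decoded = json_decode($json, true);   // true => decode to associative arrays
  if (!is_array($decoded) || !isset($decoded['data'])) {
    return array();                      // not the expected structure
  }
  $groups = array();
  foreach ($decoded['data'] as $post) {
    $key = isset($post['collection']) ? $post['collection']['alias'] : 'anonymous';
    // Fall back to the slug, then the id, when the title is blank
    $label = trim((string)$post['title']);
    if ($label === '') {
      $label = trim((string)$post['slug']) !== '' ? $post['slug'] : $post['id'];
    }
    $groups[$key][] = $label;
  }
  return $groups;
}

print_r(groupPosts($json));
```

The same loop works unchanged for each downloaded file, which is why one program can query several instances.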

The program output lists only those posts that have a match to your input keywords. The post title is a hyperlink that takes you directly to the post. There is also a link to the individual blog or anonymous posts page.
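The search form in the program combines its inputs as ( Keyword1 AND Keyword2 ) AND/OR Keyword3, with an optional But Not term. The sketch below shows one way to express that rule as a standalone function. It is an illustration under those assumptions, not the program's own routine; matchesQuery() and its parameter names are hypothetical, and blank terms are simply ignored.

```php
<?php
// Returns true when $text satisfies (k1 AND k2) <op> k3 and does not contain
// $butNot. Blank terms are ignored; matching is case-insensitive.
// (matchesQuery is an illustrative helper, not a function from the program.)
function matchesQuery($text, $k1, $k2, $k3, $op, $butNot) {
  $has = function ($term) use ($text) {
    return stripos($text, $term) !== false;
  };
  // The exclusion term wins over everything else
  if ($butNot !== '' && $has($butNot)) return false;
  // (k1 AND k2), where a blank keyword places no restriction
  $firstTwo = ($k1 === '' || $has($k1)) && ($k2 === '' || $has($k2));
  if ($k3 === '') return $firstTwo;
  return $op === 'OR' ? ($firstTwo || $has($k3)) : ($firstTwo && $has($k3));
}

// Title, tags, and body joined with || as the program does before matching
$text = 'Confidence level contours||stats plots||Some body text';
var_dump(matchesQuery($text, 'contours', 'stats', '', 'AND', ''));       // bool(true)
var_dump(matchesQuery($text, 'contours', 'missing', 'body', 'OR', ''));  // bool(true)
var_dump(matchesQuery($text, 'contours', '', '', 'AND', 'plots'));       // bool(false)
```

Passing the terms as parameters rather than reading globals keeps the rule easy to test on its own.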

This code is freely available, as is, to anyone who wants to use or modify it, but no support is provided. I am not an expert-level PHP programmer and this was a quick coding job, so I don't consider the code particularly elegant. But hey, it works. I have it installed on my web server (Debian 10, Apache 2.4, and PHP 7.3). It can also be installed on your local machine; I've tested it on Windows 10 with XAMPP. It would probably also work on macOS, provided you have a web server with PHP.

Before running this program, download the JSON data from the API for each of your Write.as or WriteFreely instances. Here's a curl command example for getting all of a user's posts via the API:

curl "https://write.as/api/me/posts" \
   -H "Authorization: Token 00000000-1111-2222-3333-444444444444" \
   -H "Content-Type: application/json" \
   -X GET  >writeas_api_posts.json

If you don't yet have an authorization token, see this to get one: https://developers.write.as/docs/api/#authenticate-a-user
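For the periodic download mentioned earlier, the same request could also be made from PHP. The following is only a sketch: it assumes allow_url_fopen is enabled on the server, and postsRequestOptions() and downloadPosts() are hypothetical helper names. The endpoint and headers mirror the curl command above.

```php
<?php
// Build the stream context options for an authenticated GET request.
// (Both functions here are illustrative helpers, not part of the program.)
function postsRequestOptions($token) {
  return array('http' => array(
    'method'  => 'GET',
    'header'  => "Authorization: Token $token\r\nContent-Type: application/json\r\n",
    'timeout' => 30,
  ));
}

// Fetch all posts from $instanceUrl and save the raw JSON to $outFile.
// Returns the number of posts saved, or false on failure.
function downloadPosts($instanceUrl, $token, $outFile) {
  $url = rtrim($instanceUrl, '/') . '/api/me/posts';
  $context = stream_context_create(postsRequestOptions($token));
  $json = @file_get_contents($url, false, $context);
  if ($json === false) return false;
  // Validate before saving so a bad response never overwrites good data
  $decoded = json_decode($json, true);
  if (!is_array($decoded) || !isset($decoded['data'])) return false;
  if (file_put_contents($outFile, $json) === false) return false;
  return count($decoded['data']);
}

// Example call, using the placeholder token from the curl example:
// downloadPosts('https://write.as/', '00000000-1111-2222-3333-444444444444', 'wa-posts.json');
```

Whichever method you use, the saved filename has to match the filename configured in the search program.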

The beginning of the PHP program has a Required Input section where you specify your data files, the URLs of your Write.as and/or WriteFreely instance(s) [1], and a flag that controls whether '.md' is appended to anonymous post links so they render as markdown. See the code for more info. You don't need to specify your individual blogs: based on your auth token, the API downloads all your blogs and anonymous posts. The URLs are used only to construct the hyperlinks to the matched posts. They are not used to retrieve data; the data comes only from the downloaded API files.

[1] As an example, I included two WriteFreely services, https://qua.name and https://wordsmith.social. Note that if you have custom domain names for your instances, you would of course use those.
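To make the role of the instance URL and the markdown flag concrete, here is a small sketch of how a link to a matched post can be assembled from the API data. It follows the pattern used in the program (instance URL + blog alias + slug for blog posts, instance URL + post id + extension for anonymous posts); postUrl() and the sample posts are hypothetical.

```php
<?php
// Build the public URL for one post from the API data.
// $href   : base URL of the instance, e.g. "https://write.as/"
// $md_ext : '' or '.md', appended to anonymous post links only
// (postUrl is an illustrative helper, not a function from the program.)
function postUrl($post, $href, $md_ext) {
  $base = rtrim($href, '/') . '/';   // tolerate a missing trailing slash
  if (isset($post['collection'])) {
    // Blog post: <instance>/<blog alias>/<slug>
    return $base . $post['collection']['alias'] . '/' . $post['slug'];
  }
  // Anonymous post: <instance>/<post id><md_ext>
  return $base . $post['id'] . $md_ext;
}

$blogPost = array('id' => 'aaaa1111', 'slug' => 'first-post',
                  'collection' => array('alias' => 'myblog'));
$anonPost = array('id' => 'bbbb2222', 'slug' => null);

echo postUrl($blogPost, 'https://write.as', ''), "\n";     // https://write.as/myblog/first-post
echo postUrl($anonPost, 'https://write.as/', '.md'), "\n"; // https://write.as/bbbb2222.md
```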

Here's an image clip of an input form and output example:

[Image: wawf-search-clip]

Here is the PHP code. For display convenience this includes all the HTML, CSS, and PHP code in one file. A more complex example would use separate files.

Note: If you have a Write.as and/or WriteFreely blog and want to try this out but don't have access to a server running PHP, it can be run on glitch.com. Instructions are provided here by CJ Eller, who has packaged this into a convenient, easy-to-use glitch app. General glitch.com Help topics are here. For info regarding public vs private visibility of the JSON data files in the glitch app, see this post.

<?php
ini_set('display_errors', 1);
ini_set('display_startup_errors', 1);
error_reporting(E_ALL);

/*======================================================================================
  REQUIRED INPUT:
========================================================================================
  Specify the filenames for your posts API data, your write.as and/or writefreely blogs, 
    and for the $md_ext value enter either '' or '.md' depending on what's needed for the 
    anonymous post files to be rendered in markdown (write.as anon posts typically need 
    the '.md' whereas writefreely anon posts typically do not).
  Modify this associative array for your situation. Include a trailing forward slash, /,
    in the $href element. Add a folder path prefix to the $fn filename if needed.
  The API data files should be in the same folder as this program.  
========================================================================================*/
$blog[0] = array("fn" => "wa-posts.json", "href" => "https://write.as/", "md_ext" => '.md');
$blog[1] = array("fn" => "qua-posts.json", "href" => "https://my.writefreely.instance/", "md_ext" => '');
$blog[2] = array("fn" => "wordsmith-posts.json", "href" => "https://wordsmith.social/", "md_ext" => '');
/*======================================================================================*/
    
/*======================================================================================
  The API data JSON files need to be downloaded before running this program. Here's a 
  curl command example for getting all user posts via the write.as or writefreely API:
    
  curl "https://write.as/api/me/posts" \
    -H "Authorization: Token 00000000-1111-2222-3333-444444444444" \
    -H "Content-Type: application/json" \
    -X GET  >writeas_api_posts.json
    
  If you don't yet have an auth token, see this to get one:
  https://developers.write.as/docs/api/#authenticate-a-user
======================================================================================*/

// IMPORTANT NOTES:
//
// For anonymous posts, in order for the Title field to be populated in the API data,
// the first line of the post MUST start with a single # and a space. Otherwise, the 
// Title field will be blank.
//
// For the names of the elements provided by the API data, see bottom of this file
?>

<!DOCTYPE HTML>
<html>
<head>
  <meta http-equiv="content-type" content="text/html; charset=UTF-8">
  <title>WaWf API Data Search</title>
  <style type="text/css">
    body {
      font-family: helvetica, sans-serif;
      font-size: 15px;
      line-height: 1.3;
      margin: 15px;
    }
    a {
      color: blue;
      text-decoration: none;
    }
    a:hover {
      color: #AA08AA;
      font-weight: bold;
      text-decoration: underline;
    }
  </style>
</head>
<body> 
<?php
//==========================================
// Main part of program
//==========================================
$msg = 'Search these files: ';
foreach($blog as $i => $item) {   // $i is the index of the current $blog[] entry
  $fn = $item['fn'];
  if (!file_exists($fn)) {
    echo "<span style=\"color:red;\"><b>Error: File $fn not found; please check and either correct filename or remove it from the list (i.e., the relevant " . '$blog[] entry in this program)</b></span>';
    exit;
  } else {
    $msg .= $fn . ", ";
  }
  $href = $item['href'];
  // Add the trailing slash if it was omitted from this $blog[] entry
  if (substr($href, -1) !== '/') $blog[$i]['href'] = $href . '/';
}  
echo substr($msg, 0, -2) . "<br><br>";
$search1=''; $search2=''; $search3=''; $butnot=''; $bOption=''; $nq='';

if (isset($_POST['submit'])) {
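  // Fields missing from the request default to '' (or 'AND' for the option)
  $search1 = $_POST['searchkey1'] ?? '';
  $search2 = $_POST['searchkey2'] ?? '';
  $search3 = $_POST['searchkey3'] ?? '';
  $butnot =  $_POST['butnot'] ?? '';
  $bOption = $_POST['bOption'] ?? 'AND';
  $uri = $_SERVER['REQUEST_URI'];
  $nq = '<a href="' . htmlspecialchars($uri, ENT_QUOTES) . '">New Query</a><br>';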
}

// Escape user-supplied values before echoing them back into the page
print '<form name="myform" action="' . htmlspecialchars($_SERVER['PHP_SELF'], ENT_QUOTES) . '" method="post">
	Keyword1: (&nbsp;<input name="searchkey1" size="15" autofocus value="' . htmlspecialchars($search1, ENT_QUOTES) . '">&nbsp;<b>AND</b>
	Keyword2: <input name="searchkey2" size="15" value="' . htmlspecialchars($search2, ENT_QUOTES) . '">&nbsp;)&nbsp;
 	<select name="bOption">';
if ($bOption == 'OR') {
  print	'<option value="AND">AND</option><option value="OR" selected>OR</option>';
} else {
  print	'<option value="AND" selected>AND</option><option value="OR">OR</option>';
}  
print	'</select> 
	Keyword3: <input name="searchkey3" size="15" value="' . htmlspecialchars($search3, ENT_QUOTES) . '">&nbsp;
	But Not: <input name="butnot" size="15" value="' . htmlspecialchars($butnot, ENT_QUOTES) . '">&nbsp;
	<input type="submit" name="submit" value="Go">&nbsp;&nbsp;' . $nq . '</form>';
if (!isset($_POST['submit'])) exit;

// The "submit" variable exists, so the form has been submitted - process form input and display results

// The API data file can have multiple blogs along with anonymous posts and they may be all mixed together with no 
// separation and no sort order. Therefore, if you want to have all anon posts and each blog's posts together in grouped
// sections, a sorting mechanism is needed. We sort on the url (which groups each blog's posts together) and to handle 
// the anon posts, we add a class="anon" to the url so they'll be grouped together when sorted.

if ($search1 == "" && $search2 == "") {
  echo '<br><b><span style="color:red;">You must input at least one search keyword in the first 2 input boxes</span></b>';
  exit;
}

$iMatches = 0;
foreach($blog as $item) {
  $fn = $item['fn'];
  $href = $item['href'];
  $md_ext = $item['md_ext'];
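  $json = file_get_contents($fn);
  $array = json_decode($json, true);
  // Skip files that are not valid JSON or that lack the expected "data" array
  if (!is_array($array) || !isset($array['data']) || !is_array($array['data'])) {
    echo "<br><span style=\"color:red;\"><b>Error: File $fn does not contain the expected API data; skipping it</b></span><br>";
    continue;
  }
  $blogposts = $array['data'];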
  $lines='';
  
  foreach ($blogposts as $post) {  
    $id    = $post['id'];
    $slug  = $post['slug'];
    $title = $post['title'];
    // title might be blank; if so, substitute the slug if it's not also blank, otherwise use the post id
    if (trim($title) == '') {
      if (trim($slug) !== '') {
        $title = $slug;
      } else {
        $title = $id;
      }
    }
    $tags  = $post['tags'];  // note $tags is an array
    $body  = $post['body'];
    $url = '<a href="' . $href . $id . '.md" target="_blank">' . "<b>$title | $slug</b></a><br>\n"; 
    //Check title, tags, and body of blog post for keyword matches - entire post vs individual lines?
    $contents = $title . '||' . implode(" ", $tags) . '||' . $body; //use || separator so don't have false match possibility due to a run-on 
    if (isMatched($contents)) {
      // Matched
      if (isset($post['collection'])) {
        // Blog post
        $coll = $post['collection'];
        $cAlias = $coll['alias'];
        $blogUrl = '<a href="' . $href .  $coll['alias'] . '" target="_blank">' . $cAlias . "</a>"; 
        $url = '<a href="' . $href . $cAlias . '/' . $slug . '" target="_blank">' . "<b>$title</b></a>"; 
        $lines .=  substr($cAlias,0,3) . "$blogUrl | $url<br>\n";
      } else {
        // Anonymous posts
        // They typically have a null slug (possible exception is if they've been moved from blog post to anonymous post)
        // No trailing "<br>\n" here; the line break is appended once when the entry is added to $lines
        $url = '<a href="' . $href . $id . $md_ext . '" target="_blank">' . "<b>$title</b></a>";
        $lines .= 'ano' . '<a class="anon" href="' . $href . 'me/posts">Anonymous</a>' . " | $url<br>\n";   //class="anon" needed for sort
      }
      $iMatches += 1;
    }  
  } // end foreach blog post
  
  // No matches in this file: keep going so the remaining data files are still searched
  if ($lines === '') continue;
  
  // $lines contains each blog post. Sort them so all anon posts and separate blog posts group together.
  $posts = explode("\n", $lines);
  sort($posts, SORT_NATURAL | SORT_FLAG_CASE);
  $prev = '';
  $i = 1;
  $itot = 0;
  foreach($posts as $p) {
    if(strlen($p) > 4) {
      if (substr($p,0,3) !== $prev) {
        echo "<br>\n";
        $i = 1;
      }  
      echo str_pad($i, 2, '0', STR_PAD_LEFT) . ": " . substr($p,3) . "\n";
      $i++;
      $itot += 1;
      $prev = substr($p, 0,3);
    }
  }
} // end foreach blog

echo "<br>$iMatches matches were found for the search criteria";

function isMatched($contents) {
  global $search1, $search2, $search3, $butnot;
  global $bOption;  //applies solely to the 3rd searchkey input

  // But Not: skip the check when the field is blank, because stripos() with an
  // empty needle matches everything in PHP 8 and raises a warning in PHP 7
  if ($butnot !== '' && stripos($contents, $butnot) !== false) return false;

  // Evaluate (Keyword1 AND Keyword2); a blank keyword places no restriction.
  // By the program logic, at least one of the first two keywords is populated.
  $firstTwo = true;
  foreach (array($search1, $search2) as $term) {
    if ($term !== '' && stripos($contents, $term) === false) $firstTwo = false;
  }

  // A blank Keyword3 drops out of the expression entirely
  if ($search3 === '') return $firstTwo;
  $third = stripos($contents, $search3) !== false;

  // Either (Keyword1 AND Keyword2) OR Keyword3, or (Keyword1 AND Keyword2) AND Keyword3
  return ($bOption == 'OR') ? ($firstTwo || $third) : ($firstTwo && $third);
}

/*
Example elements in the JSON API data (one post)
  "id": "xxxxxxxxxx",
  "slug": "confidence-level-contours",
  "appearance": "sans",
  "language": "en",
  "rtl": false,
  "created": "2019-11-03T15:46:58Z",
  "updated": "2019-11-03T16:09:50Z",
  "title": "Confidence level contours",
  "body": "Some body text",
  "tags": [],
  "images": [
    "https://i.snap.as/xxxxxxx.jpg",
    "https://i.snap.as/yyyyyyy.jpg"
  ],
  "views": 0,
  "collection": {
    "alias": "MyBlogName",
    "title": "My Blog Title",
    "description": "Primarily About this, that, and the other",
    "style_sheet": "",
    "public": false,
    "views": 22,
    "total_posts": 0
  }
*/

?>
</body>
</html>