Skip to main content

Code snippet: Getting all Drupal nodes without a URL alias

Nov 5, 2018

Sometimes may happen that a few of your thousands of nodes do not have URL aliases. Obviously, you can regenerate paths using the pathauto, but it may cause damage changing the existing URLs. Also, if you have hundreds of thousands of nodes, the update may take quite a while.

You can get the list un-aliased nodes executing the following code. You can easily do this on a development site, using "devel/php" page.

$query = db_select('url_alias', 'u');
$query->fields('u', array('source'));
$result = $query->execute();
$urls = $result->fetchCol();

$query = db_select('node', 'n');
$query->fields('n', array('nid'));
$result = $query->execute();
$nids = $result->fetchCol();

$unaliased = array();
foreach($nids as $nid) {
  if (array_search('node/' . $nid, $urls) !== FALSE) { 
    $unaliased[] = $nid;
  }
}

dsm($unaliased);

UPD. Another solution suggested by Dima Sukharchuk:

$query = db_select('node', 'n');
$query->addJoin('LEFT OUTER', 'url_alias', 'a', 'a.source = CONCAT(:arg, n.nid)', [':arg' => 'node/']);
$query->addField('n', 'nid');
$query->isNull('a.pid');
$nids = $query->execute()->fetchCol();

Updated solution should be much faster, since there is no need to execute two queries and run a loop.

Drupal 8 snippet is not much different from a D7 version. Though, I am providing it here in full in case someone wants to save the time adapting the query for Drupal 8 syntax.

use Drupal\Core\Database\Database;

$db_connection = Database::getConnection('default');

$query = $db_connection->select('url_alias', 'u');
$query->fields('u', array('source'));
$result = $query->execute();
$urls = $result->fetchCol();

$query = $db_connection->select('node', 'n');
$query->fields('n', array('nid'));
$result = $query->execute();
$nids = $result->fetchCol();

$unaliased = array();
foreach($nids as $nid) {
  if (!array_search('node/' . $nid, $urls)) {
    $unaliased[] = $nid;
  }
}

dsm($unaliased);

Unfortunately, I was not able to pack everything into a single SQL query, because of the lack of entity ID field. The 'url_alias' table simply does not contain this column. Joining with concatenation does not sound like a feasible solution for me either.

I hope this short snippet will help someone and save 10-20 minutes of debugging.