Manage Elasticsearch indices with the elasticsearch-php library

I already wrote on this blog about the elasticsearch-php library maintained by Elastic, the company that created Elasticsearch. It's my first choice when I need to add Elasticsearch to a project. Unlike bundles such as FOSElasticaBundle, this library has no automatic life cycle management of indices: you need to handle everything yourself. Based on my experience of using Elasticsearch in many Symfony projects, I'll show you how I manage my indices with a few Symfony commands.

Manage index name

An Elasticsearch index changes over time, not only in its content but also in its mapping. It's not recommended to rely on automatic (dynamic) mapping to handle new fields in the index. It's better to have a strict mapping, or to discard unknown fields. In that case, to add a new field we have to update the mapping, and it's better to create a new index too.
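
To make the strict mapping idea concrete, here is a minimal sketch of an index creation body written as a PHP array (elasticsearch-php accepts an array as well as a JSON string for the body). The field names title and published_at are purely illustrative assumptions, not the mapping of this blog. With 'dynamic' => 'strict', Elasticsearch rejects documents containing unknown fields; 'dynamic' => false would silently ignore them instead.

```php
<?php
declare(strict_types=1);

// Hypothetical mapping: "title" and "published_at" are illustrative field names.
$body = [
    'mappings' => [
        // 'strict' rejects documents with unknown fields,
        // false would accept them but leave the unknown fields unindexed.
        'dynamic' => 'strict',
        'properties' => [
            'title' => ['type' => 'text'],
            'published_at' => ['type' => 'date'],
        ],
    ],
];

// Print the JSON that would be sent as the request body.
echo json_encode($body, JSON_PRETTY_PRINT | JSON_THROW_ON_ERROR), PHP_EOL;
```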

To avoid changing the index name in our code every time we create a new index, we should use another, immutable name that points to our index: this is the job of the alias (I'll come back to it later). A good practice is to stamp the index name with the date and time.
Using this blog as an example, I have moncode as the alias name and moncode_date_time as the pattern for my indices' names.
I have a PHP class to manage the name of my index.

<?php
declare(strict_types=1);
 
namespace Metfan\LibSearch\Index;
 
use \DateTimeImmutable;
use \DateTimeZone;
 
class IndexNameGenerator
{
    public function __construct(private string $indexPattern)
    {
    }
 
    public function generateName(): string
    {
        $now = new DateTimeImmutable('now', new DateTimeZone('UTC'));
 
        $replacement = [
            '<DATE>' => $now->format('Ymd'),
            '<TIME>' => $now->format('His'),
        ];
 
        return str_replace(
            array_keys($replacement), 
            array_values($replacement), 
            $this->indexPattern
        );
    }
 
    public function generateWildcardPattern(): string
    {
        $replacement = [
            '<DATE>' => '*',
            '<TIME>' => '*',
        ];
 
        return str_replace(
            array_keys($replacement), 
            array_values($replacement), 
            $this->indexPattern
        );
    }
}

The method generateWildcardPattern() is used to search all indices when I switch the alias or delete an index.
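
To illustrate what the two methods produce, here is a small standalone sketch. It does not use the class above: expandIndexPattern() is a hypothetical helper that mirrors generateName() but receives the date as an argument so the result is reproducible, and the pattern moncode_<DATE>_<TIME> is my assumption of how the moncode_date_time pattern would be written.

```php
<?php
declare(strict_types=1);

// Hypothetical helper mirroring generateName(), with the clock passed in.
function expandIndexPattern(string $pattern, DateTimeImmutable $now): string
{
    return strtr($pattern, [
        '<DATE>' => $now->format('Ymd'),
        '<TIME>' => $now->format('His'),
    ]);
}

// Hypothetical helper mirroring generateWildcardPattern().
function expandWildcardPattern(string $pattern): string
{
    return strtr($pattern, ['<DATE>' => '*', '<TIME>' => '*']);
}

$pattern = 'moncode_<DATE>_<TIME>'; // assumed pattern, for illustration only
$now = new DateTimeImmutable('2024-03-05 14:30:09', new DateTimeZone('UTC'));

echo expandIndexPattern($pattern, $now), PHP_EOL; // moncode_20240305_143009
echo expandWildcardPattern($pattern), PHP_EOL;    // moncode_*_*
```

Because the date comes first and is zero-padded, sorting the index names alphabetically also sorts them chronologically, which is what the alias switch below relies on.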

Connection to Elasticsearch

Most of the time I have a simple connection to an Elasticsearch cluster with few options. I wrote a builder based on an interface, so I can easily swap it when I need something more complex.

<?php
declare(strict_types=1);
 
namespace Metfan\LibSearch\Client;
 
use Elastic\Elasticsearch\Client;
use Elastic\Elasticsearch\ClientBuilder as EsClientBuilder;
 
class ClientBuilder implements ClientBuilderInterface
{
    private ?Client $client = null;
 
    public function __construct(private string $host, private string $port)
    {
    }
 
    public function build(): Client
    {
        // Reuse the client if it has already been built
        if (null !== $this->client) {
            return $this->client;
        }
 
        $this->client = EsClientBuilder::create()
            ->setHosts([$this->host.':'.$this->port])
            ->build();
 
        return $this->client;
    }
}

Create an index

Creating an index is easy. It's just a PUT request on the API with the name of the index and the mapping definition. I always use the same method to achieve that: I write the mapping in JSON format using Twig. I talked about that in this post, where I wrote Elasticsearch request bodies using Twig files.

My first class allows me to create the request body using Twig.

<?php
declare(strict_types=1);
 
namespace Metfan\LibSearch\Request;
 
use Twig\Environment;
 
class RequestForgery
{
    public function __construct(
        private Environment $templating,
        private string $aliasName,
        private string $templateName
    ) {
    }
 
    /**
     * @param array<string, mixed>  $tplParams
     *
     * @return array{index: string, body: string}
     */
    public function forgeRequest(
        array $tplParams = [], 
        ?string $indexName = null
    ): array {
        return [
            'index' => $indexName ?? $this->aliasName,
            'body' => $this->templating->render($this->templateName, $tplParams)
        ];
    }
}

Index creation goes through the Elasticsearch API, which is wrapped by the Indices class in the elasticsearch-php library.

<?php
declare(strict_types=1);
 
namespace Metfan\LibSearch\Index;
 
use Metfan\LibSearch\Client\ClientBuilderInterface;
use Metfan\LibSearch\Request\RequestForgery;
 
class IndexCreator
{
    public function __construct(
        private ClientBuilderInterface $builder,
        private RequestForgery $forgery,
        private IndexNameGenerator $indexNameGenerator
    ) {
    }
 
    public function createIndex(): string
    {
        $client = $this->builder->build();
 
        $indexName = $this->indexNameGenerator->generateName();
        $request = $this->forgery->forgeRequest([], $indexName);
        $client->indices()->create($request);
 
        return $indexName;
    }
}
 

Switch the alias

There is no dedicated API to switch an alias in Elasticsearch. An alias can point to one or more indices, so we have to update the alias configuration of the indices ourselves.
I don't have daily indices (that's mainly a logging use case), so I want my alias on one index at a time.
To achieve that, I need to get the list of indices matching the index pattern with wildcards. Then I have to write a request that adds the alias to one index and removes it from the others.

<?php
declare(strict_types=1);
 
namespace Metfan\LibSearch\Index;
 
use Metfan\LibSearch\Client\ClientBuilderInterface;
use Elastic\Elasticsearch\Response\Elasticsearch;
use Webmozart\Assert\Assert;
 
class IndexSwitcher
{
    public function __construct(
        private ClientBuilderInterface $clientBuilder,
        private IndexNameGenerator $indexNameGenerator,
        private string $aliasName
    ) {
    }
 
    public function switchIndex(?string $indexName = null): string
    {
        $indices = $this->clientBuilder->build()->indices();
 
        // get all indices from index pattern
        /** @var Elasticsearch $indicesAliasResponse */
        $indicesAliasResponse = $indices->getAlias(
            ['index' => $this->indexNameGenerator->generateWildcardPattern()]
        );
 
        Assert::isInstanceOf($indicesAliasResponse, Elasticsearch::class);
        $indicesAlias = $indicesAliasResponse->asArray();
 
        if (null === $indexName) {
            ksort($indicesAlias);
            $indexName = (string) key(array_reverse($indicesAlias, true));
        }
 
        $aliasName = $this->aliasName;
        //extract all indices with alias
        $currentIndicesAlias = array_keys(
            array_filter(
                $indicesAlias,
                function (array $aliases) use ($aliasName) {
                    return is_array($aliases['aliases']) 
                        && isset($aliases['aliases'][$aliasName]);
                }
            )
        );
 
        $request = [
            'body' => [
                'actions' => [
                    [
                        'add' => [
                            'index' => $indexName, 
                            'alias' => $aliasName,
                        ]
                    ],
                ]
            ]
        ];
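        foreach ($currentIndicesAlias as $current) {
            // Never remove the alias from the index it is being added to
            if ((string) $current === $indexName) {
                continue;
            }

            $request['body']['actions'][] = [
                'remove' => [
                    'index' => $current, 
                    'alias' => $aliasName,
                ]
            ];
        }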
 
        $indices->updateAliases($request);
 
        return $indexName;
    }
}

I know it's a lot of code; that's because the Elasticsearch response is a big associative array.
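
The core of the switch can be isolated from the client, which makes it easier to read and to test. The sketch below is not part of the library or of my published code: buildAliasActions() is a hypothetical helper, and the sample array only imitates the shape of the alias response (index name, then an aliases key). It skips the target index when building the remove actions, so the alias is never added and removed on the same index in one request.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: builds the "actions" list of an alias switch.
 *
 * @param array<string, array{aliases?: array<string, mixed>}> $indicesAlias
 *
 * @return list<array<string, array{index: string, alias: string}>>
 */
function buildAliasActions(array $indicesAlias, string $aliasName, string $target): array
{
    $actions = [['add' => ['index' => $target, 'alias' => $aliasName]]];

    foreach ($indicesAlias as $indexName => $config) {
        $indexName = (string) $indexName;

        // Skip the target itself and indices that do not carry the alias.
        if ($indexName === $target || !isset($config['aliases'][$aliasName])) {
            continue;
        }

        $actions[] = ['remove' => ['index' => $indexName, 'alias' => $aliasName]];
    }

    return $actions;
}

// Sample data imitating the alias response, for illustration only.
$indicesAlias = [
    'moncode_20240101_000000' => ['aliases' => ['moncode' => []]],
    'moncode_20240305_143009' => ['aliases' => []],
];

$actions = buildAliasActions($indicesAlias, 'moncode', 'moncode_20240305_143009');
echo json_encode(['actions' => $actions], JSON_PRETTY_PRINT | JSON_THROW_ON_ERROR), PHP_EOL;
```

All actions of a single alias update request are applied atomically, so the alias never points to zero indices during the switch.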

Delete an index

Removing an index is simple: the elasticsearch-php library has a delete() method on the Indices class.

<?php
declare(strict_types=1);
 
namespace Metfan\LibSearch\Index;
 
 
use Metfan\LibSearch\Client\ClientBuilderInterface;
use Metfan\LibSearch\Exception\IndexDeletionFailedException;
use Metfan\LibSearch\Exception\IndexDeletionUnauthorizedException;
use Metfan\LibSearch\Exception\IndexNotFoundException;
use Elastic\Elasticsearch\Response\Elasticsearch;
use Webmozart\Assert\Assert;
 
class IndexRemover
{
    /** @var array<string, array{name: string, with_alias: bool}>  */
    private array $indicesList = array();
 
    public function __construct(private ClientBuilderInterface $clientBuilder)
    {
    }
 
    /**
     * @return array<string, array{name: string, with_alias: bool}>
     */
    public function getIndicesList(string $indexPattern): array
    {
        $indices = $this->clientBuilder->build()->indices();
        $list = $indices->get(['index' => $indexPattern]);
 
        Assert::isInstanceOf($list, Elasticsearch::class);
 
        foreach ($list->asArray() as $indexName => $config) {
            if (!empty($config['aliases'])) {
                $this->indicesList[$indexName] = [
                    'name' => $indexName, 
                    'with_alias' => true,
                ];
                continue;
            }
 
            $this->indicesList[$indexName] = [
                'name' => $indexName, 
                'with_alias' => false,
            ];
        }
 
        return $this->indicesList;
    }
 
    public function remove(string $indexName): void
    {
        if (empty($this->indicesList)) {
            $this->getIndicesList($indexName);
        }
 
        if (!isset($this->indicesList[$indexName])) {
            throw new IndexNotFoundException($indexName);
        }
 
        if (true === ($this->indicesList[$indexName]['with_alias'])) {
            throw new IndexDeletionUnauthorizedException($indexName);
        }
 
        $indices = $this->clientBuilder->build()->indices();
        $response = $indices->delete(['index' => $indexName]);
 
        Assert::isInstanceOf($response, Elasticsearch::class);
 
        if (200 !== $response->getStatusCode()) {
            throw new IndexDeletionFailedException($indexName);
        }
    }
}
 
 

You should be careful not to delete the index that holds the alias, or you'll lose the search capability on the website.
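
As an illustration of that precaution, here is a minimal sketch that picks the indices that are safe to delete from a list shaped like the one returned by getIndicesList(). deletableIndices() is a hypothetical helper, not part of the class above, and the sample data is made up.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns the names of the indices without any alias.
 *
 * @param array<string, array{name: string, with_alias: bool}> $indicesList
 *
 * @return list<string>
 */
function deletableIndices(array $indicesList): array
{
    $withoutAlias = array_filter(
        $indicesList,
        static fn (array $index): bool => false === $index['with_alias']
    );

    // Keep only the names, as a plain list of strings.
    return array_values(array_map(
        static fn (array $index): string => $index['name'],
        $withoutAlias
    ));
}

// Sample data imitating the structure built by getIndicesList().
$indicesList = [
    'moncode_20240101_000000' => ['name' => 'moncode_20240101_000000', 'with_alias' => false],
    'moncode_20240305_143009' => ['name' => 'moncode_20240305_143009', 'with_alias' => true],
];

print_r(deletableIndices($indicesList));
```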

This is the minimum code you need to manage the index life cycle with the elasticsearch-php library. I published all the code, along with some Symfony commands to use it, on GitHub.
In the next post I'll talk about indexing documents in Elasticsearch.
