Elasticsearch

From development to production

🎨

@damienalexandre

@damienalexandre JoliCode

  • 🐘 Senior PHP consultant
  • 🔎 Elasticsearch expert and trainer
  • 👴🏽 Using Symfony since version 1.1
  • #bike #vegetarian #metal #emoji #beer

The basics

Elasticsearch

  • Just a tool, not a solution
  • Like a NoSQL database / document oriented
  • Distributed Lucene index on steroids
  • No transaction, no relation
  • Lots of complexity

Alternative search

  • Solr: based on Lucene too
  • MeiliSearch or Sonic: Rust powered and simple
  • Algolia: Search as a Service
  • Elastic App Search: same 👆🏽
  • FULLTEXT indices on your RDBMS tables...

Is Elasticsearch really for you?

Usages

  • 🔎 Full-text search
  • 📚 NoSQL database
  • 📃 Logs storage and analysis
  • 📊 Statistics
  • 🤯 Machine Learning, application performance monitoring, dashboard...

Do not use for

  • 🔥 Session, stock, cache storage
  • 💾 Key Value store
  • 💽 Primary Data Store
  • 📃 Transactional

Cluster, Node and Shard

Elasticsearch glossary

From PHP

Elasticsearch with PHP

HTTP and JSON

  • Everything is native in PHP:
    • HTTP: \file_get_contents()
    • JSON: \json_decode()
  • No extension, no worries.

The simple way


$results = json_decode(
    file_get_contents('http://localhost:9200/_search')
);
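
The same idea still works once a query body is involved, as the sketch below suggests. It only relies on PHP's stream context; the helper names `buildSearchContextOptions()` and `search()` are hypothetical, and the base URL assumes a local node on port 9200.

```php
<?php

/**
 * Build the stream context options for a JSON search request.
 * Hypothetical helper, not part of any library.
 */
function buildSearchContextOptions(array $query): array
{
    return [
        'http' => [
            'method' => 'POST',
            'header' => "Content-Type: application/json\r\n",
            'content' => json_encode($query, JSON_THROW_ON_ERROR),
            'ignore_errors' => true, // return the body even on 4xx/5xx responses
        ],
    ];
}

/** Run the search; assumes a node is listening on the given base URL. */
function search(array $query, string $baseUrl = 'http://localhost:9200'): array
{
    $context = stream_context_create(buildSearchContextOptions($query));
    $body = file_get_contents($baseUrl . '/_search', false, $context);
    if ($body === false) {
        throw new RuntimeException('Elasticsearch is unreachable');
    }

    return json_decode($body, true, 512, JSON_THROW_ON_ERROR);
}

// Only the request is built here, so no running node is needed.
$options = buildSearchContextOptions(['query' => ['match' => ['framework' => 'symfony']]]);
echo $options['http']['content'], PHP_EOL;
```

Calling `search()` with the same array would send the request. Error handling, timeouts and authentication are left out, which is part of why the packages listed next exist.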

We can do better!

PHP Packages

Existing tools

  • elasticsearch/elasticsearch = Official
  • ruflin/elastica
  • friendsofsymfony/elastica-bundle
  • madewithlove/elasticsearcher (5.x)
  • ongr/elasticsearch-dsl
  • doctrine/search Surprise! 🤪

Official client

Low level, knows all the API, associative array:

$params = [
    'index' => 'app',
    'body'  => [
        'query' => [
            'bool' => [
                'must' => [
                    'match' => [ 'framework' => 'symfony' ]
                ]
            ]
        ]
    ]
];

$response = $client->search($params);
            

Elastica

Object oriented, based on the official client


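use Elastica\Query\BoolQuery;
use Elastica\Query\Match; // Elastica 7.1+ ships MatchQuery: "match" is reserved in PHP 8

$bool = new BoolQuery();

$bool->addMust(
    new Match('framework', 'symfony')
);

// Searches run against an index (or an alias)
$response = $client->getIndex('app')->search($bool);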

FOSElasticaBundle

Bridge between Doctrine and Elastica


fos_elastica:
    indexes:
        app:
            persistence:
                driver: orm
                model: App\Entity\Product

$ bin/console fos:elastica:populate

Library: Pros / Cons

Official Client
  Pros
  • Great documentation
  • Active development
  • Always up to date
  Cons
  • Associative arrays
  • Low level

Elastica
  Pros
  • Awesome Objects
  • Active development
  Cons
  • Documentation

FOSElasticaBundle
  Pros
  • No code needed
  • Easy to use
  • Battery included
  • Strong Symfony integration
  • Opinionated
  Cons
  • Release delay
  • Hard to customize / bend to your needs
  • Opinionated

My recommendation is

❀️ Elastica ❀️

  • Official Client: too much work, and you end up juggling deep associative arrays
  • FOSElasticaBundle: an implementation that can be hard to bend to custom needs

Avoid deep associative arrays

Toilet Paper Array

Indexing implementation

Indexing a document

with Elastica


$index = $client->getIndex('app');

$id = '42';
$content = ['name' => 'hans', 'likes' => ['2', '3']];

$doc = new Document($id, $content);

$index->addDocuments([$doc]); // Send a Bulk
            

Associative array

Let's use DTOs
for the data!

Data transfer object

Another way

with Elastica


 $index = $client->getIndex('app');

 $id = '42';
-$content = ['name' => 'hans', 'likes' => ['2', '3']];
+$content = '{"name": "hans", "likes": ["2", "3"]}';

 $doc = new Document($id, $content);

 $index->addDocuments([$doc]); // Send a Bulk
            

DTO to JSON


$user = new User();
$user->name = 'hans';
$user->likes = ['2', '3'];
            

👆🏽 from and to 👇🏽


{"name": "hans", "likes": ["2", "3"]}
            

DTO to JSON to DTO

  • symfony/serializer
  • JMSSerializer
  • Jane...

We are also going to deserialize for our search result!
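
To make the round trip concrete, here is a dependency-free sketch using only `json_encode()` and `json_decode()`. It is not how the serializers listed above work internally; the helpers `toJson()` and `userFromJson()` are hypothetical names, and the `User` class only mirrors the two properties shown earlier.

```php
<?php

final class User
{
    public string $name = '';
    /** @var string[] */
    public array $likes = [];
}

/** Serialize the public properties of a DTO to JSON. */
function toJson(object $dto): string
{
    return json_encode(get_object_vars($dto), JSON_THROW_ON_ERROR);
}

/** Rebuild a User from JSON, copying only the properties the class declares. */
function userFromJson(string $json): User
{
    $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    $user = new User();
    foreach ($data as $property => $value) {
        if (property_exists($user, $property)) {
            $user->$property = $value;
        }
    }

    return $user;
}

$user = new User();
$user->name = 'hans';
$user->likes = ['2', '3'];

$json = toJson($user);        // DTO to JSON, ready to index
$copy = userFromJson($json);  // JSON back to a DTO, as for a search hit
echo $json, PHP_EOL;
```

A real serializer adds what this sketch lacks: nested objects, dates, private properties and name conversion.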

Symfony Serializer


$serializer = $container->get('serializer');

$user = new User();
$user->name = 'hans';
$user->likes = ['2', '3'];

$doc = new Document(
    '43', // document IDs are strings
    $serializer->serialize($user, 'json')
);

$index->addDocuments([$doc]);
            

Jane: PHP Code generator

Jane generates the perfect plain PHP Normalizer from a JSON Schema, or with AutoMapper!

https://jane.readthedocs.io/

Search results

Get back a DTO


$results = $index->search(
    new \Elastica\Query\Match('name', 'washwash')
);

Elastica returns Elastica\Result objects,
where "data" is an associative array.


Elastica\ResultSet:
    results: Elastica\Result[]

Elastica\ResultSet\BuilderInterface

Implement your own builder.


// Result is the Elastically subclass of Elastica\Result, which can carry a model
$result = new Result($hit);

$result->setModel(
    $this->serializer->denormalize(
        $result->getSource(),
        \User::class
    )
);
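
Stripped of Elastica and the Serializer, the job of such a builder can be sketched in plain PHP: walk the hits of a decoded `_search` response and turn each `_source` into a DTO. The function `hydrateUsers()`, the `User` class and the sample response are hypothetical and hand-written for this illustration.

```php
<?php

final class User
{
    public string $name = '';
    /** @var string[] */
    public array $likes = [];
}

/**
 * Turn the hits of a decoded _search response into User objects.
 *
 * @return User[]
 */
function hydrateUsers(array $response): array
{
    $users = [];
    foreach ($response['hits']['hits'] ?? [] as $hit) {
        $user = new User();
        $user->name = $hit['_source']['name'] ?? '';
        $user->likes = $hit['_source']['likes'] ?? [];
        $users[] = $user;
    }

    return $users;
}

// Trimmed-down sample response, written by hand for this example.
$response = [
    'hits' => [
        'hits' => [
            ['_id' => '42', '_source' => ['name' => 'hans', 'likes' => ['2', '3']]],
        ],
    ],
];

$users = hydrateUsers($response);
echo $users[0]->name, PHP_EOL;
```

In a real builder the manual property copy would be replaced by the serializer call shown above, so one class can handle every DTO.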

Custom Result Builder

Pass it to searches:


use \JoliCode\Elastically\ResultSetBuilder;

$search = $index->createSearch(
    $query,
    null,
    new ResultSetBuilder($serializer)
);

Introducing Elastically

PRO TIPS ©

Index creation


PUT /app
{
    "settings": {
        "number_of_shards": 1,
        "analysis": {
            "analyzer": {
                "yolo": {
                    "tokenizer": "standard"
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "name": { "type": "text", "analyzer": "yolo" },
            "ref": { "type": "text", "analyzer": "yolo" }
        }
    }
}

↙️ Lots of repetition

YAML > JSON

analyzers.yaml (shared)

filter:
  app_french_stemmer:
    type: stemmer
    language: light_french
analyzer:
  app_french_heavy:
    tokenizer: icu_tokenizer
    filter:
      - app_french_elision
      - icu_folding
      - app_french_stemmer

mapping.yaml (per index)

settings:
  number_of_shards: 1
  # Include analyzers.yaml here
mappings:
  properties:
    name: &txt_basic
      type: text
      analyzer: app_standard
    ref: *txt_basic
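
One possible way to honor the "Include analyzers.yaml here" comment is to merge the two files after parsing them. The sketch below assumes both files were already parsed into arrays (for example with a YAML parser, not shown); `buildIndexBody()` is a hypothetical helper and the arrays are shortened stand-ins for the files above.

```php
<?php

/**
 * Merge the shared analysis settings into a per-index definition.
 * Both arguments are plain arrays, as a YAML parser would return them.
 * Values defined by the index itself win over the shared ones.
 */
function buildIndexBody(array $mapping, array $analyzers): array
{
    $mapping['settings']['analysis'] = array_replace_recursive(
        $analyzers,
        $mapping['settings']['analysis'] ?? []
    );

    return $mapping;
}

// Stand-ins for the parsed analyzers.yaml and mapping.yaml files.
$analyzers = [
    'filter' => [
        'app_french_stemmer' => ['type' => 'stemmer', 'language' => 'light_french'],
    ],
];
$mapping = [
    'settings' => ['number_of_shards' => 1],
    'mappings' => ['properties' => ['name' => ['type' => 'text']]],
];

$body = buildIndexBody($mapping, $analyzers);
echo json_encode($body, JSON_PRETTY_PRINT), PHP_EOL;
```

The resulting array is the body of the index creation request, so the analyzers are written once and reused by every index.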

https://noyaml.com/

Still easier to write than JSON

🎓 Protip © Index Version

  • Use index versioning thanks to aliases
    • app_2020-12-001
    • app_2020-12-002 with alias app_search
    • app_2020-12-003
  • PHP should only talk to aliases
  • Moving aliases is easy
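
Moving an alias is a single `POST /_aliases` call with a `remove` and an `add` action, applied atomically. As a sketch, the body can be built like this; `buildAliasSwap()` is a hypothetical helper and the index names are the ones from the list above.

```php
<?php

/**
 * Build the body of a `POST /_aliases` request that moves an alias
 * from one index to another in a single atomic call.
 */
function buildAliasSwap(string $alias, string $oldIndex, string $newIndex): array
{
    return [
        'actions' => [
            ['remove' => ['index' => $oldIndex, 'alias' => $alias]],
            ['add' => ['index' => $newIndex, 'alias' => $alias]],
        ],
    ];
}

$body = buildAliasSwap('app_search', 'app_2020-12-002', 'app_2020-12-003');
echo json_encode($body), PHP_EOL;
```

Because PHP only knows `app_search`, the switch to the new index needs no deployment and no downtime.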

🎓 Protip © Dynamic

Don't use dynamic mapping unless you like random results


PUT /app/_doc/1
{ "rating": 9 }

PUT /app/_doc/2
{ "rating": 9.9 }

GET /app/_mapping
> { "rating": { "type": "long" }}
            

Oops, the first document mapped the field as long, so 9.9 is indexed as 9 with no warning!

🎓 Protip © Dynamic

Disable it and sleep better.


mappings:
    dynamic: false
    properties:
        rating:
            ...
            

Mapping ≠ Document

  • Mapping: fields for Lucene
  • Document / Source: your data
  • They are not strictly correlated.

{
  "title": "SymfonyWorld Online 2020",
  "url": "https://live.symfony.com/2020-world/"
}

We do not need to index "url".
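
As a sketch of what this means in practice: with `dynamic: false`, a field that is absent from the mapping stays in the document source but is neither indexed nor searchable. The arrays below are illustrative; only the `title` and `url` fields come from the example above.

```php
<?php

// Only "title" is mapped. With "dynamic": false, unmapped fields such as
// "url" are kept in _source but are not indexed or searchable.
$indexBody = [
    'mappings' => [
        'dynamic' => false,
        'properties' => [
            'title' => ['type' => 'text'],
        ],
    ],
];

$document = [
    'title' => 'SymfonyWorld Online 2020',
    'url' => 'https://live.symfony.com/2020-world/',
];

// Fields present in the document but absent from the mapping.
$notIndexed = array_keys(
    array_diff_key($document, $indexBody['mappings']['properties'])
);

echo json_encode($indexBody), PHP_EOL;
echo implode(', ', $notIndexed), PHP_EOL;
```

The search hit still returns the full document, `url` included, so nothing is lost for display.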

Continuous data sync

Don't do it synchronously

  • Default refresh latency of 1 second
  • HTTP connection to open, JSON to encode
  • Slow down your application
  • If Elasticsearch is down, the update is lost
  • You had better index asynchronously!

With Symfony Messenger

Minimal message, let the worker do the job


namespace JoliCode\Elastically\Messenger;

final class IndexationRequest
{
    private $operation; // "index" / "delete"
    private $type; // DTO FQN
    private $id; // Database ID

    public function __construct(string $type, string $id, string $operation = IndexationRequestHandler::OP_INDEX)
    {
        $this->type = $type;
        $this->id = $id;
        $this->operation = $operation;
    }

...

Symfony Messenger Handler


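class IndexationRequestHandler implements MessageHandlerInterface
{
    public function __invoke(IndexationRequest $message)
    {
        // Elided on the slide: $id and $indexName come from the message,
        // $indexer is a service injected into the handler.
        $model = null; // TODO: fetch the model matching the message
        $doc = new Document($id, $model, '_doc');

        $indexer->scheduleIndex($indexName, $doc);
        $indexer->flush();
    }
}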

Symfony Messenger

Send update request:


$this->bus->dispatch(
    new IndexationRequest(User::class, '123', 'index') // the ID is typed as string
);

Read the Messages:

$ bin/console messenger:consume

Symfony Messenger

  • Supported by
    • FOSElasticaBundle (since 6.0.0 Beta 2)
    • Elastically
  • Very little code, huge benefits
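
One of those benefits is batching: a worker that drains several messages can send them in a single Bulk request. The sketch below is not the Elastically or FOSElasticaBundle implementation; it only shows the NDJSON body of a `_bulk` call, with a hypothetical `buildBulkBody()` helper and a hand-written queue.

```php
<?php

/**
 * Turn queued indexation requests into the NDJSON body of a `_bulk` call.
 *
 * Each request is an array with the keys:
 *   'operation' ("index" or "delete"), 'id' (string), 'source' (array, index only).
 */
function buildBulkBody(string $indexName, array $requests): string
{
    $lines = [];
    foreach ($requests as $request) {
        $meta = ['_index' => $indexName, '_id' => $request['id']];
        if ($request['operation'] === 'delete') {
            // A delete action has no source line.
            $lines[] = json_encode(['delete' => $meta], JSON_THROW_ON_ERROR);
            continue;
        }
        $lines[] = json_encode(['index' => $meta], JSON_THROW_ON_ERROR);
        $lines[] = json_encode($request['source'], JSON_THROW_ON_ERROR);
    }

    // The Bulk API expects one JSON object per line and a trailing newline.
    return implode("\n", $lines) . "\n";
}

$queue = [
    ['operation' => 'index', 'id' => '123', 'source' => ['name' => 'hans']],
    ['operation' => 'delete', 'id' => '124'],
];

$bulk = buildBulkBody('app', $queue);
echo $bulk;
```

With Elastica, the indexer's schedule and flush calls shown earlier take care of this for you.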

Symfony HttpClient

We can switch Elastica HTTP Client!

Elastica\Transport\AbstractTransport

Just one class to implement 🎉

Available in Elastically

Symfony HttpClient

Symfony HTTPClient in Elastica

Production tips

Client Node

Client Node Elasticsearch

Put a Node directly on your PHP servers,
get the power of localhost:9200!

No security by default

Exposed Elasticsearch

Monitor Elasticsearch

Datadog Elastic

Don't host Elasticsearch

SymfonyCloud


# .symfony/services.yaml
mysearch:
    type: "elasticsearch:7.2"
    disk: 1024
    configuration:
        plugins:
            - analysis-icu

# .symfony.cloud.yaml
relationships:
    elasticsearch: "mysearch:elasticsearch"

Kibana == PhpStorm

Kibana 7

Kibana tips

Open local Kibana to production Elasticsearch


$ symfony tunnel:open

$ docker run -it --rm --network host \
    -e ELASTICSEARCH_HOSTS=http://127.0.0.1:30002 \
    docker.elastic.co/kibana/kibana:7.2.1

🌼 Index emoji 🌼

To conclude

😎 Ideal final stack 😎

  • YAML for mappings
  • DTO for data
  • HttpClient, Messenger, Serializer
  • Elastica (no array!)
  • Managed hosting

Thanks a lot

❓ Any questions ❓

https://github.com/jolicode/elastically

@damienalexandre
coucou@jolicode.com

Photo credits

Unsplash: Priscilla Du Preez, Aaron Burden, Suzanne D. Williams, Anthony Martino, Alexandre Godreau, David Clode, Vincent Botta, Brooke Lark
ShutterStock: Winner

Thanks

JoliCode, Perrine, Nicolas Ruflin.

Symfony World