FreeBSD Manual Pages
MROONGA(1) Mroonga MROONGA(1) NAME mroonga - Mroonga Documentation For the release history: News THE CHARACTERISTICS OF MROONGA What is Mroonga? Mroonga is a MySQL storage engine based on Groonga, the full text search engine. In MySQL 5.1 or later, Pluggable Storage Engine interface is intro- duced, and we can use custom storage engines easily. So we implement Mroonga, so that we can use Groonga through MySQL. By using Mroonga, you can use Groonga with SQL. The successor of Tritonn To support Japanese full text search, Tritonn was developed by embed- ding Senna, the predecessor of Groonga, in MySQL. Mroogna is its suc- cessor. Running as a MySQL plugin Since Tritonn was the modified version of MySQL, we need to build it by ourselves or use binary files provided by Tritonn project, thus we can- not use the official binary files provided by MySQL. On the other hand, Mroonga is an independent program (shared library) using Pluggable Storage Engine interface, and we can dynamically load it on MySQL's official binary. So we can use it more easily than Tri- tonn. Faster index update Comparing to Senna, Groonga has much better throughput in adding or up- dating index. Mroonga also has the benefit of this performance improvement. Faster search In Tritonn, we use MyISAM storage engine, thus we have a exclusive ta- ble lock by updating data (and index), and it prevents the performance of search. But in Mroonga, we no longer have this issue, and the performance of search is better especially in frequent data update cases. Geolocation search Groonga supports not only the full text search, but also the fast ge- olocation search using index. And MySQL also has the syntax for geolo- cation search. With Mroonga, you can use Groonga's fast geolocation search by using MySQL's geolocation SQL syntax. Sharing the same Groonga storage Mroonga stores the data by using Groonga's DB API. And its storage file's format is same as that of the file that is managed by Groonga itself only. Therefore you can share the same Groonga storage like be- low. • Store data through Mroonga (MySQL) and search from Groonga server. • Store data through Groonga server and search from Mroonga (MySQL). And Groonga's storage file can be shared with multi-processes and multi-threads, so that we can invoke several search queries to the same storage file simultaneously. Associate with other storage engines Mroonga has two running modes. One is "storage mode", that is the default mode, and we use Groonga for both storing data and searching. With this mode, you can have full benefits of Groonga described above, like fast data update, lock-free full text search and geolocation search. But it does not support transactions. Another one is "wrapper mode", that adds full text search function on other storage engines like MyISAM or InnoDB. With this mode, you can use Groonga's fast full text search with having the benefits of the storage engine, ex. transaction in InnoDB. But you cannot have bene- fits from Groonga's read-lock free characteristic. And you might have the performance bottle neck in the storage engine in updating data. Supported platforms Mroonga supports many of the major platforms. Refer to Install for supported platforms. INSTALL This section describes how to install Mroonga on each environment. There are packages for major platforms. It's recommended that you use package instead of building Mroonga by yourself. But don't worry. There is a document about building Mroonga from source. We distribute both 32-bit and 64-bit packages but we strongly recommend a 64-bit package for server. You should use a 32-bit package just only for tests or development. You will encounter an out of memory error with a 32-bit package even if you just process medium size data. Windows This section describes how to install Mroonga on Windows. You can in- stall Mroogna by extracting a zip package. Mroonga binary for Windows is provided with MariaDB binary because some changes are needed for building Mroonga for Windows. Install Download and extract the zip file Download zip file and extract it. You need to choose a zip for your en- vironment: • MariaDB with Mroonga-14.13 Run mariadb-install-db.exe Run bin\mariadb-install-db.exe to initialize the database. mariadb-in- stall-db.exe is included in the zip file. > bin\mariadb-install-db.exe --datadir=C:\EXAMPLE\data --service=MariaDB --password=PASSWORD • --datadir=C:\EXAMPLE\data • Data directory of the new database • --service=MariaDB • Name of the Windows service • Specify if you want to register MariaDB as a Windows service • --password=PASSWORD • Password of the root user Start MariaDB server command Just start MariaDB server by following command. > bin\mariadbd.exe --defaults-file=.\MY-PREFERRED-INI.ini --console If MariaDB is registered as a Windows service, it will be started by the Windows service. Install Mroonga to MariaDB Next connect to MariaDB by following command. > bin\mariadb.exe MariaDB [(none)]> After connecting, execute share\mroonga\install.sql to install Mroonga. share\mroonga\install.sql is included in the zip file and should be specified with an appropriate path. MariaDB [(none)]> SOURCE C:PATHTO\share\mroonga\install.sql; Query OK, 0 rows affected (0.064 sec) Query OK, 0 rows affected, 1 warning (0.000 sec) Query OK, 0 rows affected (0.002 sec) Query OK, 0 rows affected, 1 warning (0.000 sec) Query OK, 0 rows affected (0.001 sec) Query OK, 0 rows affected, 1 warning (0.000 sec) Query OK, 0 rows affected (0.001 sec) Query OK, 0 rows affected, 1 warning (0.000 sec) Query OK, 0 rows affected (0.000 sec) Query OK, 0 rows affected, 1 warning (0.000 sec) Query OK, 0 rows affected (0.001 sec) Query OK, 0 rows affected, 1 warning (0.000 sec) Query OK, 0 rows affected (0.000 sec) Query OK, 0 rows affected, 1 warning (0.000 sec) Query OK, 0 rows affected (0.000 sec) Query OK, 0 rows affected, 1 warning (0.000 sec) Query OK, 0 rows affected (0.001 sec) Query OK, 0 rows affected, 1 warning (0.000 sec) Query OK, 0 rows affected (0.000 sec) MariaDB [(none)]> SHOW ENGINES; +--------------------+---------+------------------------------------------------------------+--------------+------+------------+ | Engine | Support | Comment | Transactions | XA | Savepoints | +--------------------+---------+------------------------------------------------------------+--------------+------+------------+ | CSV | YES | CSV storage engine | NO | NO | NO | | PERFORMANCE_SCHEMA | YES | Performance Schema | NO | NO | NO | | MEMORY | YES | Hash based, stored in memory, useful for temporary tables | NO | NO | NO | | MyISAM | YES | MyISAM storage engine | NO | NO | NO | | MRG_MyISAM | YES | Collection of identical MyISAM tables | NO | NO | NO | | InnoDB | DEFAULT | Supports transactions, row-level locking, and foreign keys | YES | YES | YES | | Mroonga | YES | CJK-ready fulltext search, column store | NO | NO | NO | | Aria | YES | Crash-safe tables with MyISAM heritage | NO | NO | NO | +--------------------+---------+------------------------------------------------------------+--------------+------+------------+ 8 rows in set (0.00 sec) Build from source with MariaDB You need to use Mroonga bundled MariaDB source provided by the Mroonga project. You can find it in https://packages.groonga.org/source/mroonga/. Mroonga bundled MariaDB source has mariadb-${MARIADB_VER- SION}-with-mroonga-${MROONGA_VERSION}.zip file name. You can build the source code with the standard MariaDB build process. You need to register Mroonga after building MariaDB. Use SQL at ${MARI- ADB_BUILD_DIR}\storage\mroonga\data\install.sql to register Mroonga. macOS This section describes how to install Mroonga on macOS. You can install Mroonga by Homebrew. Homebrew See mroonga/homebrew-mroonga on GitHub for details. Debian GNU/Linux This section describes how to install Mroonga related deb packages on Debian GNU/Linux. You can install them by apt. bookworm (MariaDB) Install: % sudo apt update % sudo apt install -y -V apt-transport-https % sudo apt install -y -V wget % wget https://apache.jfrog.io/artifactory/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb % sudo apt install -y -V ./apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb % wget https://packages.groonga.org/debian/groonga-apt-source-latest-bookworm.deb % sudo apt install -y -V ./groonga-apt-source-latest-bookworm.deb % sudo apt update % sudo apt install -y -V mariadb-10.11-mroonga If you want to use MeCab as a tokenizer, install groonga-tok- enizer-mecab package. Install groonga-tokenizer-mecab package: % sudo apt install -y -V groonga-tokenizer-mecab bookworm (with the Oracle MySQL 8.0 package) Install: % sudo apt update % sudo apt install -y -V apt-transport-https % sudo apt install -y -V wget % wget https://packages.groonga.org/debian/groonga-apt-source-latest-bookworm.deb % wget https://repo.mysql.com/mysql-apt-config.deb % sudo apt install -y -V ./groonga-apt-source-latest-bookworm.deb % sudo env DEBIAN_FRONTEND=noninteractive MYSQL_SERVER_VERSION=mysql-8.0 apt install -y ./mysql-apt-config.deb % sudo apt update % sudo apt install -y -V mysql-community-8.0-mroonga If you want to use MeCab as a tokenizer, install groonga-tok- enizer-mecab package. Install groonga-tokenizer-mecab package: % sudo apt install -y -V groonga-tokenizer-mecab Ubuntu This section describes how to install Mroonga related deb packages on Ubuntu. You can install them by apt. PPA (Personal Package Archive) The Mroonga APT repository for Ubuntu uses PPA (Personal Package Archive) on Launchpad. You can install Mroonga by APT from the PPA. Here are supported Ubuntu versions: • 20.04 Focal Fossa • 22.04 Jammy Jellyfish Here are Ubuntu versions that supports MySQL: • 20.04 Focal Fossa • 22.04 Jammy Jellyfish Here are Ubuntu versions that supports MariaDB: • 20.04 Focal Fossa • 22.04 Jammy Jellyfish Enable the universe repository and the security update repository to install Mroonga: % sudo apt-get install -y -V software-properties-common lsb-release % sudo add-apt-repository -y universe % sudo add-apt-repository "deb http://security.ubuntu.com/ubuntu $(lsb_release --short --codename)-security main restricted" Add the ppa:groonga/ppa PPA to your system: % sudo add-apt-repository -y ppa:groonga/ppa % sudo apt-get update Install Mroonga for MySQL: % sudo apt-get install -y -V mysql-server-mroonga Install Mroonga for MariaDB: % sudo apt-get install -y -V mariadb-server-mroonga If you want to use MeCab as a tokenizer, install groonga-tok- enizer-mecab package. Install groonga-tokenizer-mecab package: % sudo apt-get install -y -V groonga-tokenizer-mecab AlmaLinux This section describes how to install Mroonga related RPM packages on AlmaLinux. You can install them by dnf. AlmaLinux 8 (with the Oracle MySQL 8.0 package) You can use Oracle's MySQL packages version 8.0 on AlmaLinux 8 since Mroonga 11.10 release. NOTE: There are already known issues about MySQL 8.0. • Wrapper mode Wrapper mode is not supported yet • Storage mode Storage mode does not support the following feature. • The feature of relevant to the optimization. Install: % sudo dnf -y module disable mysql % sudo dnf install -y https://packages.groonga.org/almalinux/8/groonga-release-latest.noarch.rpm % sudo dnf install -y https://repo.mysql.com/mysql80-community-release-el8.rpm % sudo dnf install --disablerepo=AppStream -y --enablerepo=epel,powertools mysql-community-8.0-mroonga (% sudo systemctl start mysqld) (% tmp_password=$(sudo grep 'A temporary password' /var/log/mysqld.log | sed -e 's/^.*: //')) (% sudo mysqladmin -u root --password="${tmp_password}" password) If you want to use MeCab as a tokenizer, install groonga-tok- enizer-mecab package. Install groonga-tokenizer-mecab package: % sudo dnf -y module enable mysql % sudo dnf install -y --enablerepo=epel groonga-tokenizer-mecab % sudo dnf -y module disable mysql AlmaLinux 8 (with Percona Server 8.0 package) You can use Percona Server packages version 8.0 on AlmaLinux 8 since Mroonga 11.10 release. Install: % sudo dnf install -y https://packages.groonga.org/almalinux/8/groonga-release-latest.noarch.rpm % sudo dnf install -y https://repo.percona.com/yum/percona-release-latest.noarch.rpm % sudo percona-release setup ps80 % sudo dnf install -y --enablerepo=epel,powertools percona-server-8.0-mroonga (% sudo systemctl start mysqld) (% tmp_password=$(sudo grep 'A temporary password' /var/log/mysqld.log | sed -e 's/^.*: //')) (% sudo mysqladmin -u root --password="${tmp_password}" password) If you want to use MeCab as a tokenizer, install groonga-tok- enizer-mecab package. Install groonga-tokenizer-mecab package: % sudo dnf install -y --enablerepo=epel groonga-tokenizer-mecab AlmaLinux 8 (with MariaDB 10.5 package) You can use MariaDB's MariaDB packages version 10.5 on AlmaLinux 8 since Mroonga 11.10 release. Create /etc/yum.repos.d/MariaDB.repo with the following content: [mariadb] name = MariaDB baseurl = https://rpm.mariadb.org/10.5/rhel/$releasever/$basearch gpgkey = https://rpm.mariadb.org/RPM-GPG-KEY-MariaDB gpgcheck = 1 Install: % sudo dnf install -y https://packages.groonga.org/almalinux/8/groonga-release-latest.noarch.rpm % sudo dnf module -y disable mariadb % sudo dnf module -y disable mysql % sudo dnf install -y --enablerepo=powertools mariadb-server % sudo systemctl start mariadb % sudo dnf install -y --enablerepo=powertools mariadb-10.5-mroonga (% sudo mysqladmin -u root password 'new-password') If you want to use MeCab as a tokenizer, install groonga-tok- enizer-mecab package. Install groonga-tokenizer-mecab package: % sudo dnf module -y enable mysql % sudo dnf install -y --enablerepo=epel groonga-tokenizer-mecab % sudo dnf module -y disable mysql AlmaLinux 8 (with MariaDB 10.6 package) You can use MariaDB's MariaDB packages version 10.6 on AlmaLinux 8 since Mroonga 11.10 release. Create /etc/yum.repos.d/MariaDB.repo with the following content: [mariadb] name = MariaDB baseurl = https://rpm.mariadb.org/10.6/rhel/$releasever/$basearch gpgkey = https://rpm.mariadb.org/RPM-GPG-KEY-MariaDB gpgcheck = 1 Install: % sudo dnf install -y https://packages.groonga.org/almalinux/8/groonga-release-latest.noarch.rpm % sudo dnf module -y disable mariadb % sudo dnf module -y disable mysql % sudo dnf install -y --enablerepo=powertools mariadb-server % sudo systemctl start mariadb % sudo dnf install -y --enablerepo=powertools mariadb-10.6-mroonga (% sudo mysqladmin -u root password 'new-password') If you want to use MeCab as a tokenizer, install groonga-tok- enizer-mecab package. Install groonga-tokenizer-mecab package: % sudo dnf module -y enable mysql % sudo dnf install -y --enablerepo=epel groonga-tokenizer-mecab % sudo dnf module -y disable mysql AlmaLinux 8 (with MariaDB 10.11 package) You can use MariaDB's MariaDB packages version 10.11 on AlmaLinux 8 since Mroonga 13.01 release. Create /etc/yum.repos.d/MariaDB.repo with the following content: [mariadb] name = MariaDB baseurl = https://rpm.mariadb.org/10.11/rhel/$releasever/$basearch gpgkey = https://rpm.mariadb.org/RPM-GPG-KEY-MariaDB gpgcheck = 1 Install: % sudo dnf install -y https://packages.groonga.org/almalinux/8/groonga-release-latest.noarch.rpm % sudo dnf module -y disable mariadb % sudo dnf module -y disable mysql % sudo dnf install -y --enablerepo=powertools mariadb-server % sudo systemctl start mariadb % sudo dnf install -y --enablerepo=powertools mariadb-10.11-mroonga (% sudo mysqladmin -u root password 'new-password') If you want to use MeCab as a tokenizer, install groonga-tok- enizer-mecab package. Install groonga-tokenizer-mecab package: % sudo dnf module -y enable mysql % sudo dnf install -y --enablerepo=epel groonga-tokenizer-mecab % sudo dnf module -y disable mysql AlmaLinux 8 (with MariaDB 11.4 package) You can use MariaDB's MariaDB packages version 11.4 on AlmaLinux 8 since Mroonga 14.07 release. Create /etc/yum.repos.d/MariaDB.repo with the following content: [mariadb] name = MariaDB baseurl = https://rpm.mariadb.org/11.4/rhel/$releasever/$basearch gpgkey = https://rpm.mariadb.org/RPM-GPG-KEY-MariaDB gpgcheck = 1 Install: $ sudo dnf install -y https://packages.groonga.org/almalinux/8/groonga-release-latest.noarch.rpm $ sudo dnf module -y disable mariadb $ sudo dnf module -y disable mysql $ sudo dnf install -y --enablerepo=powertools mariadb-server $ sudo systemctl start mariadb $ sudo dnf install -y --enablerepo=powertools mariadb-11.4-mroonga ($ sudo mariadb-admin -u root password 'new-password') If you want to use MeCab as a tokenizer, install groonga-tok- enizer-mecab package. Install groonga-tokenizer-mecab package: $ sudo dnf module -y enable mysql $ sudo dnf install -y --enablerepo=epel groonga-tokenizer-mecab $ sudo dnf module -y disable mysql AlmaLinux 9 (with the Oracle MySQL 8.0 package) You can use Oracle's MySQL packages version 8.0 on AlmaLinux 9 since Mroonga 12.12 release. NOTE: There are already known issues about MySQL 8.0. • Wrapper mode Wrapper mode is not supported yet • Storage mode Storage mode does not support the following feature. • The feature of relevant to the optimization. Install: % sudo dnf install -y https://apache.jfrog.io/artifactory/arrow/almalinux/9/apache-arrow-release-latest.rpm % sudo dnf install -y https://packages.groonga.org/almalinux/9/groonga-release-latest.noarch.rpm % sudo dnf install -y https://repo.mysql.com/mysql80-community-release-el9.rpm % sudo dnf install --disablerepo=AppStream -y --enablerepo=epel,crb mysql-community-8.0-mroonga (% sudo systemctl start mysqld) (% tmp_password=$(sudo grep 'A temporary password' /var/log/mysqld.log | sed -e 's/^.*: //')) (% sudo mysqladmin -u root --password="${tmp_password}" password) If you want to use MeCab as a tokenizer, install groonga-tok- enizer-mecab package. Install groonga-tokenizer-mecab package: % sudo dnf install -y --enablerepo=epel groonga-tokenizer-mecab AlmaLinux 9 (with Percona Server 8.0 package) You can use Percona Server packages version 8.0 on AlmaLinux 9 since Mroonga 12.12 release. Install: % sudo dnf install -y https://apache.jfrog.io/artifactory/arrow/almalinux/9/apache-arrow-release-latest.rpm % sudo dnf install -y https://packages.groonga.org/almalinux/9/groonga-release-latest.noarch.rpm % sudo dnf install -y https://repo.percona.com/yum/percona-release-latest.noarch.rpm % sudo percona-release setup ps80 % sudo dnf install -y --enablerepo=epel percona-server-8.0-mroonga (% sudo systemctl start mysqld) (% tmp_password=$(sudo grep 'A temporary password' /var/log/mysqld.log | sed -e 's/^.*: //')) (% sudo mysqladmin -u root --password="${tmp_password}" password) If you want to use MeCab as a tokenizer, install groonga-tok- enizer-mecab package. Install groonga-tokenizer-mecab package: % sudo dnf install -y --enablerepo=epel groonga-tokenizer-mecab AlmaLinux 9 (with MariaDB 10.5 package) You can use MariaDB's MariaDB packages version 10.5 on AlmaLinux 9 since Mroonga 12.12 release. Create /etc/yum.repos.d/MariaDB.repo with the following content: [mariadb] name = MariaDB baseurl = https://rpm.mariadb.org/10.5/rhel/$releasever/$basearch gpgkey = https://rpm.mariadb.org/RPM-GPG-KEY-MariaDB gpgcheck = 1 Install: % sudo dnf install -y https://apache.jfrog.io/artifactory/arrow/almalinux/9/apache-arrow-release-latest.rpm % sudo dnf install -y https://packages.groonga.org/almalinux/9/groonga-release-latest.noarch.rpm % sudo dnf install -y mariadb-server % sudo systemctl start mariadb % sudo dnf install -y mariadb-10.5-mroonga (% sudo mysqladmin -u root password 'new-password') If you want to use MeCab as a tokenizer, install groonga-tok- enizer-mecab package. Install groonga-tokenizer-mecab package: % sudo dnf install -y --enablerepo=epel groonga-tokenizer-mecab AlmaLinux 9 (with MariaDB 10.6 package) You can use MariaDB's MariaDB packages version 10.6 on AlmaLinux 9 since Mroonga 12.12 release. Create /etc/yum.repos.d/MariaDB.repo with the following content: [mariadb] name = MariaDB baseurl = https://rpm.mariadb.org/10.6/rhel/$releasever/$basearch gpgkey = https://rpm.mariadb.org/RPM-GPG-KEY-MariaDB gpgcheck = 1 Install: % sudo dnf install -y https://apache.jfrog.io/artifactory/arrow/almalinux/9/apache-arrow-release-latest.rpm % sudo dnf install -y https://packages.groonga.org/almalinux/9/groonga-release-latest.noarch.rpm % sudo dnf install -y mariadb-server % sudo systemctl start mariadb % sudo dnf install -y mariadb-10.6-mroonga (% sudo mysqladmin -u root password 'new-password') If you want to use MeCab as a tokenizer, install groonga-tok- enizer-mecab package. Install groonga-tokenizer-mecab package: % sudo dnf install -y --enablerepo=epel groonga-tokenizer-mecab AlmaLinux 9 (with MariaDB 10.11 package) You can use MariaDB's MariaDB packages version 10.11 on AlmaLinux 9 since Mroonga 13.01 release. Create /etc/yum.repos.d/MariaDB.repo with the following content: [mariadb] name = MariaDB baseurl = https://rpm.mariadb.org/10.11/rhel/$releasever/$basearch gpgkey = https://rpm.mariadb.org/RPM-GPG-KEY-MariaDB gpgcheck = 1 Install: % sudo dnf install -y https://apache.jfrog.io/artifactory/arrow/almalinux/9/apache-arrow-release-latest.rpm % sudo dnf install -y https://packages.groonga.org/almalinux/9/groonga-release-latest.noarch.rpm % sudo dnf install -y mariadb-server % sudo systemctl start mariadb % sudo dnf install -y mariadb-10.11-mroonga (% sudo mysqladmin -u root password 'new-password') If you want to use MeCab as a tokenizer, install groonga-tok- enizer-mecab package. Install groonga-tokenizer-mecab package: % sudo dnf install -y --enablerepo=epel groonga-tokenizer-mecab AlmaLinux 9 (with MariaDB 11.4 package) You can use MariaDB's MariaDB packages version 11.4 on AlmaLinux 9 since Mroonga 14.07 release. Create /etc/yum.repos.d/MariaDB.repo with the following content: [mariadb] name = MariaDB baseurl = https://rpm.mariadb.org/11.4/rhel/$releasever/$basearch gpgkey = https://rpm.mariadb.org/RPM-GPG-KEY-MariaDB gpgcheck = 1 Install: $ sudo dnf install -y https://apache.jfrog.io/artifactory/arrow/almalinux/9/apache-arrow-release-latest.rpm $ sudo dnf install -y https://packages.groonga.org/almalinux/9/groonga-release-latest.noarch.rpm $ sudo dnf install -y mariadb-server $ sudo systemctl start mariadb $ sudo dnf install -y mariadb-11.4-mroonga ($ sudo mariadb-admin -u root password 'new-password') If you want to use MeCab as a tokenizer, install groonga-tok- enizer-mecab package. Install groonga-tokenizer-mecab package: $ sudo dnf install -y --enablerepo=epel groonga-tokenizer-mecab Oracle Linux This section describes how to install Mroonga related RPM packages on AlmaLinux. You can install them by dnf. Oracle Linux 8 (with the Oracle MySQL 8.0 package) You can use Oracle's MySQL packages version 8.0 on Oracle Linux 8 since Mroonga 12.08 release. NOTE: There are already known issues about MySQL 8.0. • Wrapper mode Wrapper mode is not supported yet • Storage mode Storage mode does not support the following feature. • The feature of relevant to the optimization. Install: % sudo dnf -y module disable mysql % sudo dnf install -y https://packages.groonga.org/almalinux/8/groonga-release-latest.noarch.rpm % sudo dnf install -y https://repo.mysql.com/mysql80-community-release-el8.rpm % sudo dnf install --disablerepo=AppStream -y --enablerepo=ol8_codeready_builder mysql-community-8.0-mroonga (% sudo systemctl start mysqld) (% tmp_password=$(sudo grep 'A temporary password' /var/log/mysqld.log | sed -e 's/^.*: //')) (% sudo mysqladmin -u root --password="${tmp_password}" password) If you want to use MeCab as a tokenizer, install groonga-tok- enizer-mecab package. Install groonga-tokenizer-mecab package: % sudo dnf install -y --enablerepo=ol8_codeready_builder groonga-tokenizer-mecab Fedora Fedora 21 MariaDB 10.0.x is adopted on Fedora 21. As Mroonga is now bundled with MariaDB since 10.0.15, you can use bundled version of Mroonga. Note that MariaDB bundled Mroonga version is a bit old. Docker Mroonga is also available via Docker Hub. See Quick Start guide to start. Others This section describes how to install Mroonga from source code. If there is no package no your environment, you need to build Mroonga from source code. Dependencies Mroonga needs some tools, libraries and MySQL for build. You can use MariaDB instead of MySQL. Tools Here are required tools. • wget, curl or Web browser for downloading source archive • tar and gzip for extracting source archive • shell (many shells such as dash, bash and zsh will work) • C compiler and C++ compiler (gcc and g++ are supported but other com- pilers may work) • CMake as a cross-platform build system generator • Ninja as a small build system with a focus on speed • pkg-config for detecting libraries You must get them ready. Here are optional tools. • sudo for installing built Mroonga Libraries Here are required libraries. • Groonga • If you use package, install development package such as lib- groonga-dev for deb or groonga-devel for RPM • groonga-normalizer-mysql Here are optional libraries. • MeCab: Japanese morphological analysis system NOTE: If you want to use indexes of tokenizing of each morpheme for full text search, install MeCab before installing Groonga. MySQL Mroonga needs not only installed MySQL but also MySQL source and build directory. You can't use MySQL package. It doesn't provide MySQL source and build directory. You need MySQL source and build directory! If you use MariaDB instead of MySQL, you need MariaDB source. Download the latest MySQL 8.4 source code, then build and install it. See also Download MySQL Community Server Here we assume that you use mysql-8.4.1 and its source code is ex- tracted in the following directory. ${HOME}/local/src/mysql-8.4.1 Then build in the following directory. ${HOME}/local/build/mysql-8.4.1 Here are command lines to build and install MySQL. $ cmake \ -S ${HOME}/local/src/mysql-8.4.1 \ -B ${HOME}/local/build/mysql-8.4.1 \ -GNinja \ -DCMAKE_BUILD_TYPE=Release \ -DCMAKE_INSTALL_PREFIX=${HOME}/local $ cmake --build ${HOME}/local/build/mysql-8.4.1 $ cmake --install ${HOME}/local/build/mysql-8.4.1 You need to run MySQL before you install Mroonga. Because you need to run some SQL statements to register Mroonga. MariaDB You can use MariaDB instead of MySQL. Note that you need to remove storage/mroonga/ (Mroonga bundled in Mari- aDB) before you build MariaDB. $ mkdir -p ${HOME}/local/src $ cd ${HOME}/local/src $ wget https://downloads.mariadb.org/rest-api/mariadb/11.4.3/mariadb-11.4.3.tar.gz $ tar xf mariadb-11.4.3.tar.gz $ rm -rf mariadb-11.4.3/storage/mroonga $ cd - $ cmake \ -S ${HOME}/local/src/mariadb-11.4.3 \ -B ${HOME}/local/build/mariadb-11.4.3 \ -GNinja \ -DCMAKE_BUILD_TYPE=Release \ -DCMAKE_INSTALL_PREFIX=${HOME}/local $ cmake --build ${HOME}/local/build/mariadb-11.4.3 $ cmake --install ${HOME}/local/build/mariadb-11.4.3 Build from source Mroonga uses CMake. So the following is the simplest build steps. $ cd ${HOME}/local/src $ wget https://packages.groonga.org/source/mroonga/mroonga-latest.tar.gz $ tar xvf mroonga-latest.tar.gz $ mroonga_base_name=$(find mroonga-* -maxdepth 0 -type d) $ cmake \ -S ${HOME}/local/src/${mroonga_base_name} \ -B ${HOME}/local/build/${mroonga_base_name} \ -GNinja \ -DCMAKE_BUILD_TYPE=Release \ -DCMAKE_INSTALL_PREFIX=${HOME}/local \ -DMYSQL_BUILD_DIR=${HOME}/local/build/mysql-8.4.1 \ -DMYSQL_CONFIG=${HOME}/local/bin/mysql_config \ -DMYSQL_SOURCE_DIR=${HOME}/local/src/mysql-8.4.1 $ cmake --build ${HOME}/local/build/${mroonga_base_name} $ cmake --install ${HOME}/local/build/${mroonga_base_name} $ ${HOME}/local/bin/mysql -u root < ${HOME}/local/share/mroonga/install.sql You need to specify the following on cmake. • -DMYSQL_BUILD_DIR: The location of MySQL build directory • -DMYSQL_CONFIG: The path of mysql_config command • -DMYSQL_SOURCE_DIR: The location of MySQL source code You can confirm Mroonga is installed successfully by SHOW ENGINES SQL. If you can find Mroonga row, Mroonga is installed successfully. mysql> SHOW ENGINES; +------------+---------+------------------------------------------------------------+--------------+------+------------+ | Engine | Support | Comment | Transactions | XA | Savepoints | +------------+---------+------------------------------------------------------------+--------------+------+------------+ | Mroonga | YES | Fulltext search, column base | NO | NO | NO | | MRG_MYISAM | YES | Collection of identical MyISAM tables | NO | NO | NO | | CSV | YES | CSV storage engine | NO | NO | NO | | MyISAM | DEFAULT | Default engine as of MySQL 3.23 with great performance | NO | NO | NO | | InnoDB | YES | Supports transactions, row-level locking, and foreign keys | YES | YES | YES | | MEMORY | YES | Hash based, stored in memory, useful for temporary tables | NO | NO | NO | +------------+---------+------------------------------------------------------------+--------------+------+------------+ 6 rows in set (0.00 sec) The following describes details about each step. cmake First, you need to run cmake. Here are important cmake parameters. -DMYSQL_SOURCE_DIR=PATH Specifies the location of MySQL source code. This is required parameter. -DMYSQL_BUILD_DIR=PATH Specifies the location where you build MySQL source code. If you build MySQL in MySQL source code directory, you don't need to specify this parameter. If you build MySQL in other directory, you need to specify this parameter. -DMYSQL_CONFIG=PATH Specifies the path of mysql_config command. If mysql_config command can be found by PATH, you don't need to specify this parameter. For example, if mysql_config command exists at /usr/bin/mysql_config, you don't need to specify this parameter. -DMRN_DEFAULT_TOKENIZER=TOKENIZER Specifies the default tokenizer for full text. You can custom it in my.cnf. The default is TokenBigram. Here is an example to use TokenMecab as the default tokenizer: $ cmake ... -DMRN_DEFAULT_TOKENIZER=TokenMecab -DCMAKE_BUILD_TYPE={Release,Debug,RelWithDebInfo} Specifies the build type. If you use this Mroonga as a production environment, you must use -DCAMKE_BUILD_TYPE=Release or -DCAMKE_BUILD_TYPE=RelWithDebInfo. They enable optimization. If you use this Mroonga for development, you must use -DCAMKE_BUILD_TYPE=Debug. It disables optimization and enables debug symbols. They are useful for development. -DCMAKE_INSTALL_PREFIX=PREFIX Specifies the install base directory. Mroonga related files are in- stalled under ${PREFIX}/ directory except ha_mroonga.so. ha_mroonga.so is a MySQL plugin file. It is installed the plugin directory of MySQL. The default is /usr/local. In this case, install.sql that is used for installing Mroonga is installed to /usr/local/share/mroonga/in- stall.sql. cmake --build If cmake is succeeded, you can build Mroonga by cmake --build. If you get some errors by cmake --build, please report them to us: How to report a bug cmake --install Now, you can install built Mroonga! If you don't have write permission for ${PREFIX} and the plugin direc- tory of MySQL, you need to use sudo: $ sudo cmake --install ${HOME}/local/build/${mroonga_base_name} mysql -u root < install.sql You need to run some SQL statements to register Mroonga to MySQL such as INSTALL PLUGIN and CREATE FUNCTION. They are written in ${PRE- FIX}/share/mroonga/install.sql. Uninstall Mroonga If you want to uninstall Mroonga, use the following command lines: $ ${HOME}/local/bin/mysql -u root < ${PREFIX}/share/mroonga/uninstall.sql $ xargs rm < ${HOME}/local/build/mroonga/install_manifest.txt UPGRADE There is a case that incompatible change is introduced at new release. It is announced by release announce if new release contains such a in- compatible change. Here is the list of recommended way of upgrading Mroonga from old re- lease. See following URL about upgrade sequence if you use previous version. If you upgrade prior to 1.20, refer to Release 1.20 - 2012/01/29 If you upgrade from 1.20, refer to Release 2.00 - 2012/02/29 If you upgrade from 2.00 or 2.01, refer to Release 2.02 - 2012/04/29 If you upgrade from 2.00 or later and using multiple column indexes on storage mode, refer to Release 2.03 - 2012/05/29 If you upgrade from 2.04 or later and using SET column or ENUM that has the number of elements < 256 in Storage mode, refer to Release 2.05 - 2012/07/29 If you upgrade from 2.05 or later and using multiple column indexes against VARCHAR or CHAR, refer to Release 2.08 - 2012/10/29 If you upgrade from 2.08 or later and using TIMESTAMP column, please recreate database. If you upgrade from 2.08 or later and using CHAR(N) as primary key, please recreate index. Refer to Release 2.09 - 2012/11/29 for each case. If you upgrade prior to 5.03 and satisfies following the conditions, refer to Release 5.04 - 2015/06/29 and upgrade schema. • Using custom tokenizer in index comment • Using mroonga_default_parser as server variable • Using index_flags parameter for index column • Using type parameter for Groonga's column type in storage mode TUTORIAL If you don't install Mroonga yet, see Install. It describes how to in- stall Mroonga. Try the following tutorials after you install Mroonga. You will under- stand how to use Mroonga. Installation check It is better that you check Mroonga installation before you use Mroonga. If Mroonga installation is failed, the SQLs in this tutorial will fail. The way to start or stop MySQL server is just same as the normal MySQL. After invoking the MySQL server, connect to it by mysql command. If you set password, you need to add '-p' option. % mysql -uroot test By using SHOW ENGINES command, you can check if Mroonga is installed: SHOW ENGINES; -- +------------+---------+------------------------------------------------------------+--------------+------+------------+ -- | Engine | Support | Comment | Transactions | XA | Savepoints | -- +------------+---------+------------------------------------------------------------+--------------+------+------------+ -- | Mroonga | YES | Fulltext search, column base | NO | NO | NO | -- | MRG_MYISAM | YES | Collection of identical MyISAM tables | NO | NO | NO | -- | CSV | YES | CSV storage engine | NO | NO | NO | -- | MyISAM | DEFAULT | Default engine as of MySQL 3.23 with great performance | NO | NO | NO | -- | InnoDB | YES | Supports transactions, row-level locking, and foreign keys | YES | YES | YES | -- | MEMORY | YES | Hash based, stored in memory, useful for temporary tables | NO | NO | NO | -- +------------+---------+------------------------------------------------------------+--------------+------+------------+ -- 6 rows in set (0.00 sec) If you see Mroonga storage engine like the above, the installation is well done. If Mroonga isn't shown, run the following command. If you don't install Mroonga by package, the path of install.sql may be different: % mysql -uroot test < /usr/share/mroonga/install.sql If you can't find install.sql, run INSTALL PLUGIN command like the fol- lowing manually: INSTALL PLUGIN Mroonga SONAME 'ha_mroonga.so'; Then, check server variable whether correct Mroonga version is in- stalled by SHOW VARIABLES command. SHOW VARIABLES LIKE 'mroonga_version'; Mode Mroonga has the following two modes. • storage mode • wrapper mode With the storage mode, we use Groonga for both of the full text search function and the data storage. Since all functions of storage engine are realised with Groonga, aggregations are fast, that is one of Groonga's advantages, and you can manage the database directly by groonga command. The structure of the storage mode is the following. You use it instead of existing storage engines like MyISAM or InnoDB [image: storage mode] [image] With the wrapper mode, Groonga is used for full text search function only, and another existing storage engine like InnoDB is used for stor- ing data. By using wrapper mode, you combine InnoDB that is well-use as the storage engine and Mroonga that is a proven full text search en- gine, and you can use it as the stable database having the fast full text search function. The structure of the wrapper mode is the following. Full text search related operations are done by Mroonga, and other operations are done by existing storage engines like MyISAM, InnoDB etc. Mroonga is lo- cated between SQL Handler that processes SQL and an existing storage engine, thus all data goes through Mroonga With this way, full text search indexing etc. are done transparently. [image: wrapper mode] [image] Storage mode Here we explain how to use storage mode of Mroonga How to use full text search After confirming the installation, let's create a table. The important point is to specify Mroonga by ENGINE = Mroonga: CREATE TABLE diaries ( id INT PRIMARY KEY AUTO_INCREMENT, content VARCHAR(255), FULLTEXT INDEX (content) ) ENGINE = Mroonga DEFAULT CHARSET utf8; -- Query OK, 0 rows affected (0.10 sec) We put data by INSERT: INSERT INTO diaries (content) VALUES ("It'll be fine tomorrow."); -- Query OK, 1 row affected (0.01 sec) INSERT INTO diaries (content) VALUES ("It'll rain tomorrow"); -- Query OK, 1 row affected (0.00 sec) Try full text search: SELECT * FROM diaries WHERE MATCH(content) AGAINST("+fine" IN BOOLEAN MODE); -- +----+-----------------------------------------+ -- | id | content | -- +----+-----------------------------------------+ -- | 1 | It'll be fine tomorrow. | -- +----+-----------------------------------------+ -- 1 row in set (0.00 sec) Yes, full text search works. How to get search score NOTE: In version 1.0.0 or before, Mroonga used a special column named _score to get search score. From version 1.0.0, it follows MySQL's standard way to get search score. We often want to display more relevant results first in full text search. We use search score in such case. We can get search score by MySQL's standard way [1], i.e. we use MATCH...AGAINST in one of columns in SELECT or ORDER BY. Let's try: INSERT INTO diaries (content) VALUES ("It's fine today. It'll be fine tomorrow as well."); -- Query OK, 1 row affected (0.00 sec) INSERT INTO diaries (content) VALUES ("It's fine today. But it'll rain tomorrow."); -- Query OK, 1 row affected (0.00 sec) SELECT *, MATCH (content) AGAINST ("+fine" IN BOOLEAN MODE) FROM diaries WHERE MATCH (content) AGAINST ("+fine" IN BOOLEAN MODE) ORDER BY MATCH (content) AGAINST ("+fine" IN BOOLEAN MODE) DESC; -- +----+--------------------------------------------------+---------------------------------------------------+ -- | id | content | MATCH (content) AGAINST ("+fine" IN BOOLEAN MODE) | -- +----+--------------------------------------------------+---------------------------------------------------+ -- | 3 | It's fine today. It'll be fine tomorrow as well. | 2 | -- | 1 | It'll be fine tomorrow. | 1 | -- | 4 | It's fine today. But it'll rain tomorrow. | 1 | -- +----+--------------------------------------------------+---------------------------------------------------+ -- 3 rows in set (0.00 sec) The result having the search word fine more, i.e. id = 3 message having the higher search score, is displayed first. And you also get search score by using MATCH AGAINST in SELECT phrase. You can use AS to change the attribute name: SELECT *, MATCH (content) AGAINST ("+fine" IN BOOLEAN MODE) AS score FROM diaries WHERE MATCH (content) AGAINST ("+fine" IN BOOLEAN MODE) ORDER BY MATCH (content) AGAINST ("+fine" IN BOOLEAN MODE) DESC; -- +----+--------------------------------------------------+-------+ -- | id | content | score | -- +----+--------------------------------------------------+-------+ -- | 3 | It's fine today. It'll be fine tomorrow as well. | 2 | -- | 1 | It'll be fine tomorrow. | 1 | -- | 4 | It's fine today. But it'll rain tomorrow. | 1 | -- +----+--------------------------------------------------+-------+ -- 3 rows in set (0.00 sec) How to specify the parser for full text search MySQL has the following syntax to specify the parser [2] for full text search: FULLTEXT INDEX (content) WITH PARSER parser_name To use this syntax, you need to register all parsers in MySQL before- hand. On the other hand, Groonga can dynamically add a tokenizer, that is a parser in MySQL. So if use this syntax in Mroonga, tokenizers that are added in Groonga dynamically cannot be supported. We think that this limitation decreases the convenience, and we choose our own syntax using COMMENT like the following: FULLTEXT INDEX (content) COMMENT 'tokenizer "TokenMecab"' NOTE: COMMENT in FULLTEXT INDEX is only supported MySQL 5.5 or later. If you use MySQL 5.1, use mroonga_default_tokenizer variable described below. You can specify one of the following values as the tokenizer. "tok- enizer" in Mroonga equals to "parser" in MySQL. Available tokenizers +----------------------------+----------------------------+ | Tokenizer | Description | +----------------------------+----------------------------+ | off | (Deprecated) It does not | | | tokenize at all. Use off | | | if you want to treat con- | | | tent as is. For example, | | | this value is used for | | | prefix search. | +----------------------------+----------------------------+ | none | Added in version 8.07: It | | | does not tokenize at all. | | | Use none if you want to | | | treat content as is. For | | | example, this value is | | | used for prefix search. | +----------------------------+----------------------------+ | TokenBigram | It tokenises in bigram. | | | But continuous alphabets, | | | numbers or symbols are | | | treated as a token. So | | | there can exist tokes with | | | 3 letters or more. It is | | | to reduce noises. | | | | | | This is the default value. | +----------------------------+----------------------------+ | TokenMecab | It tokenises using MeCab. | | | Groonga should be built | | | with MeCab support. | +----------------------------+----------------------------+ | TokenBigramSplitSymbol | It tokenises in bigram. | | | Unlike TokenBigram, con- | | | tinuous symbols are not | | | treated as a token, but | | | tokenised in bigram. | | | | | | When you use TokenBigram- | | | SplitSymbol instead of To- | | | kenBigram, "!?" can match | | | "!?!?!?" in "Is it re- | | | ally!?!?!?". But when you | | | use TokenBigram, only | | | "!?!?!?" can match as | | | well. | +----------------------------+----------------------------+ | TokenBigramSplitSymbolAl- | It tokenise in bigram. In | | pha | addition to TokenBigram- | | | SplitSymbol, continuous | | | alphabets are not treated | | | as a token either, but to- | | | kenised in bigram. | | | | | | When you use TokenBigram- | | | SplitSymbolAlpha instead | | | of TokenBigram, "real" can | | | match "Is it really?". But | | | when you use TokenBigram, | | | only "really" can match as | | | well. | +----------------------------+----------------------------+ | TokenBigramSplitSymbolAl- | It tokenise in bigram. In | | phaDigit | addition to TokenBigram- | | | SplitSymbolAlpha, continu- | | | ous numbers are not | | | treated as a token either, | | | but tokenised in bigram. | | | So any kind of characters | | | are treated equally in bi- | | | gram. | | | | | | When you use TokenBigram- | | | SplitSymbolAlphaDigit in- | | | stead of TokenBigram, | | | "567" can match | | | "090-0123-4567". But when | | | you use TokenBigram, only | | | "4567" can match as well. | +----------------------------+----------------------------+ | TokenBigramIgnoreBlank | It tokenise in bigram. Un- | | | like TokenBigram, it ig- | | | nores white spaces. | | | | | | When you use TokenBi- | | | gramIgnoreBlank instead of | | | TokenBigram, "" can match | | | " ". But when you use | | | TokenBigram, only " " | | | can match as well. | +----------------------------+----------------------------+ | TokenBigramIgnore- | It tokenise in bigram. Un- | | BlankSplitSymbol | like TokenBigramSplitSym- | | | bol, it ignores white | | | spaces. | | | | | | When you use TokenBi- | | | gramIgnoreBlankSplitSymbol | | | instead of TokenBigram- | | | SplitSymbol, "???" can | | | match "! ? ???". But when | | | you use TokenBigramSplit- | | | Symbol, only "? ??" can | | | match as well. | +----------------------------+----------------------------+ | TokenBigramIgnore- | It tokenise in bigram. Un- | | BlankSplitSymbolAlpha | like TokenBigramSplitSym- | | | bolAlpha, it ignores white | | | spaces. | | | | | | When you use TokenBi- | | | gramIgnoreBlankSplitSymbo- | | | lAlpha instead of TokenBi- | | | gramSplitSymbolAlpha, | | | "ama" can match "I am a | | | pen.". But when you use | | | TokenBigramSplitSymbolAl- | | | pha, only "am a" can match | | | as well. | +----------------------------+----------------------------+ | TokenBigramIgnore- | It tokenise in bigram. Un- | | BlankSplitSymbolAlphaDigit | like TokenBigramSplitSym- | | | bolAlphaDigit, it ignores | | | white spaces. | | | | | | When you use TokenBi- | | | gramIgnoreBlankSplitSymbo- | | | lAlphaDigit instead of To- | | | kenBigramSplitSymbolAl- | | | phaDigit, "9001" can match | | | "090 0123 4567". But when | | | you use TokenBigramSplit- | | | SymbolAlphaDigit, only "90 | | | 01" can match as well. | +----------------------------+----------------------------+ | TokenDelimit | It tokenise by splitting | | | with a white space. | | | | | | "movie horror topic" will | | | be tokenised as "movie", | | | "horror", "topic". | +----------------------------+----------------------------+ | TokenDelimitNull | It tokenise by splitting | | | with a null character | | | (\\0). | | | | | | "movie\\0horror\\0topic" | | | will be tokenised as | | | "movie", "horror", | | | "topic". | +----------------------------+----------------------------+ | TokenUnigram | It tokenises in unigram. | | | But continuous alphabets, | | | numbers or symbols are | | | treated as a token. So | | | there can exist tokes with | | | 2 letters or more. It is | | | to reduce noises. | +----------------------------+----------------------------+ | TokenTrigram | It tokenises in trigram. | | | But continuous alphabets, | | | numbers or symbols are | | | treated as a token. So | | | there can exist tokes with | | | 4 letters or more. It is | | | to reduce noises. | +----------------------------+----------------------------+ You can specify the default parser by passing --with-default-tokenizer option in configure when you build Mroonga: ./configure --with-default-tokenizer TokenMecab ... Or you can set mroonga_default_tokenizer variable in my.cnf or by SQL. If you specify it in my.cnf, the change will not be lost after restart- ing MySQL, but you need to restart MySQL to make it effective. On the other hand, if you set it in SQL, the change is effective immediately, but it will be lost when you restart MySQL. my.cnf: [mysqld] mroonga_default_tokenizer=TokenMecab SQL: SET GLOBAL mroonga_default_tokenizer = TokenMecab; -- Query OK, 0 rows affected (0.00 sec) How to specify the normalizer Mroonga uses normalizer corresponding to the encoding of document. It is used when tokenizing text and storing table key. It is used NormalizerMySQLGeneralCI normalizer when the encoding is utf8_general_ci or utf8mb4_general_ci. It is used NormalizerMySQLUnicodeCI normalizer when the encoding is utf8_unicode_ci or utf8mb4_unicode_ci. It isn't used normalizer when the encoding is utf8_bin. Here is an example that uses NormalizerMySQLUnicodeCI normalizer by specifying utf8_unicode_ci: SET NAMES utf8; -- Query OK, 0 rows affected (0.00 sec) CREATE TABLE diaries ( day DATE PRIMARY KEY, content VARCHAR(64) NOT NULL, FULLTEXT INDEX (content) ) Engine=Mroonga DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; -- Query OK, 0 rows affected (0.18 sec) INSERT INTO diaries VALUES ("2013-04-23", " "); -- Query OK, 1 row affected (0.00 sec) SELECT * FROM diaries WHERE MATCH (content) AGAINST ("+" IN BOOLEAN MODE); -- +------------+-----------------------------------------+ -- | day | content | -- +------------+-----------------------------------------+ -- | 2013-04-23 | | -- +------------+-----------------------------------------+ -- 1 row in set (0.00 sec) SELECT * FROM diaries WHERE MATCH (content) AGAINST ("+" IN BOOLEAN MODE); -- +------------+-----------------------------------------+ -- | day | content | -- +------------+-----------------------------------------+ -- | 2013-04-23 | | -- +------------+-----------------------------------------+ -- 1 row in set (0.00 sec) Mroonga has the following syntax to specify Groonga's normalizer: FULLTEXT INDEX (content) COMMENT 'normalizer "NormalizerAuto"' See Groonga's documentation about normalizer for more details. Here is an example that uses NormalizerAuto normalizer: SET NAMES utf8; -- Query OK, 0 rows affected (0.00 sec) CREATE TABLE diaries ( day DATE PRIMARY KEY, content VARCHAR(64) NOT NULL, FULLTEXT INDEX (content) COMMENT 'normalizer "NormalizerAuto"' ) Engine=Mroonga DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; -- Query OK, 0 rows affected (0.19 sec) INSERT INTO diaries VALUES ("2013-04-23", " "); -- Query OK, 1 row affected (0.00 sec) SELECT * FROM diaries WHERE MATCH (content) AGAINST ("+" IN BOOLEAN MODE); -- Empty set (0.00 sec) SELECT * FROM diaries WHERE MATCH (content) AGAINST ("+" IN BOOLEAN MODE); -- +------------+-----------------------------------------+ -- | day | content | -- +------------+-----------------------------------------+ -- | 2013-04-23 | | -- +------------+-----------------------------------------+ -- 1 row in set (0.00 sec) How to specify the token filters Mroonga has the following syntax to specify Groonga's token filters.: FULLTEXT INDEX (content) COMMENT 'token_filters "TokenFilterStem"' Here is an example that uses TokenFilterStem token filter: SELECT mroonga_command('register token_filters/stem'); -- +------------------------------------------------+ -- | mroonga_command('register token_filters/stem') | -- +------------------------------------------------+ -- | true | -- +------------------------------------------------+ -- 1 row in set (0.00 sec) CREATE TABLE memos ( id INT NOT NULL PRIMARY KEY, content TEXT NOT NULL, FULLTEXT INDEX (content) COMMENT 'normalizer "NormalizerAuto", token_filters "TokenFilterStem"' ) Engine=Mroonga DEFAULT CHARSET=utf8; -- Query OK, 0 rows affected (0.18 sec) INSERT INTO memos VALUES (1, "I develop Groonga"); -- Query OK, 1 row affected (0.00 sec) INSERT INTO memos VALUES (2, "I'm developing Groonga"); -- Query OK, 1 row affected (0.00 sec) INSERT INTO memos VALUES (3, "I developed Groonga"); -- Query OK, 1 row affected (0.00 sec) SELECT * FROM memos WHERE MATCH (content) AGAINST ("+develops" IN BOOLEAN MODE); -- +----+------------------------+ -- | id | content | -- +----+------------------------+ -- | 1 | I develop Groonga | -- | 2 | I'm developing Groonga | -- | 3 | I developed Groonga | -- +----+------------------------+ -- 3 rows in set (0.01 sec) See Groonga's documentation about token filter documentation for more details. Here is an example that uses TokenFilterStopWord token filter: SELECT mroonga_command("register token_filters/stop_word"); -- +-----------------------------------------------------+ -- | mroonga_command("register token_filters/stop_word") | -- +-----------------------------------------------------+ -- | true | -- +-----------------------------------------------------+ -- 1 row in set (0.00 sec) CREATE TABLE terms ( term VARCHAR(64) NOT NULL PRIMARY KEY, is_stop_word BOOL NOT NULL ) Engine=Mroonga COMMENT='tokenizer "TokenBigram", token_filters "TokenFilterStopWord"' DEFAULT CHARSET=utf8; -- Query OK, 0 rows affected (0.12 sec) CREATE TABLE memos ( id INT NOT NULL PRIMARY KEY, content TEXT NOT NULL, FULLTEXT INDEX (content) COMMENT 'table "terms"' ) Engine=Mroonga DEFAULT CHARSET=utf8; -- Query OK, 0 rows affected (0.17 sec) INSERT INTO terms VALUES ("and", true); -- Query OK, 1 row affected (0.00 sec) INSERT INTO memos VALUES (1, "Hello"); -- Query OK, 1 row affected (0.00 sec) INSERT INTO memos VALUES (2, "Hello and Good-bye"); -- Query OK, 1 row affected (0.00 sec) INSERT INTO memos VALUES (3, "Good-bye"); -- Query OK, 1 row affected (0.00 sec) SELECT * FROM memos WHERE MATCH (content) AGAINST ('+"Hello and"' IN BOOLEAN MODE); -- +----+--------------------+ -- | id | content | -- +----+--------------------+ -- | 1 | Hello | -- | 2 | Hello and Good-bye | -- +----+--------------------+ -- 2 rows in set (0.01 sec) It's used that specifying the lexicon table for fulltext search. How to specify Groonga's column flags Mroonga has the following syntax to specify Groonga's column flags: content TEXT COMMENT 'flags "COLUMN_SCALAR|COMPRESS_ZLIB"' Here is an example that uses COMPRESS_ZLIB flag: CREATE TABLE entries ( id INT UNSIGNED PRIMARY KEY, content TEXT COMMENT 'flags "COLUMN_SCALAR|COMPRESS_ZLIB"' ) Engine=Mroonga DEFAULT CHARSET=utf8; -- Query OK, 0 rows affected (0.12 sec) See Groonga's documentation about column flags for more details. How to use geolocation search In storage mode, you can use fast geolocation search in addition to full text search. But unlike MyISAM, you can only store POINT type data. You cannot store other types data like LINE. And fast search us- ing index only supports MBRContains. It does not support MBRDisjoint. For the table definition for geolocation search, you need to define a POINT type column like in MyISAM and define SPATIAL INDEX for it.: CREATE TABLE shops ( id INT PRIMARY KEY AUTO_INCREMENT, name VARCHAR(255), location POINT NOT NULL, SPATIAL INDEX (location) ) ENGINE = Mroonga; -- Query OK, 0 rows affected (0.06 sec) To store data, you create POINT type data by using geomFromText() func- tion like in MyISAM: INSERT INTO shops VALUES (null, 'Nezu''s Taiyaki', GeomFromText('POINT(139.762573 35.720253)')); -- Query OK, 1 row affected (0.00 sec) INSERT INTO shops VALUES (null, 'Naniwaya', GeomFromText('POINT(139.796234 35.730061)')); -- Query OK, 1 row affected (0.00 sec) INSERT INTO shops VALUES (null, 'Yanagiya Taiyaki', GeomFromText('POINT(139.783981 35.685341)')); -- Query OK, 1 row affected (0.00 sec) If you want to find shops within the rectangle where Ikebukuro station (139.7101 35.7292) is the top-left point and Tokyo Station (139.7662 35.6815) is the bottom-right point, SELECT phrase is like the follow- ing: SELECT id, name, AsText(location) FROM shops WHERE MBRContains(GeomFromText('LineString(139.7101 35.7292, 139.7662 35.6815)'), location); -- +----+-----------------------+------------------------------------------+ -- | id | name | AsText(location) | -- +----+-----------------------+------------------------------------------+ -- | 1 | Nezu's Taiyaki | POINT(139.762572777778 35.7202527777778) | -- +----+-----------------------+------------------------------------------+ -- 1 row in set (0.00 sec) Here you can search by geolocation! How to get the record ID Groonga assigns a unique number to identify the record when a record is added in the table. To make the development of applications easier, you can get this record ID by SQL in Mroonga To get the record ID, you need to create a column named _id when you create a table: CREATE TABLE memos ( _id INT, content VARCHAR(255), UNIQUE KEY (_id) USING HASH ) ENGINE = Mroonga; -- Query OK, 0 rows affected (0.04 sec) Tye typo of _id column should be integer one (TINYINT, SMALLINT, MEDIU- MINT, INT or BIGINT). You can create an index for _id column, but it should be HASH type. Let's add records in the table by INSERT. Since _id column is imple- mented as a virtual column and its value is assigned by Groonga, you cannot specify the value when updating. So you need to exclude it from setting columns, or you need to use null as its value: INSERT INTO memos VALUES (null, "Saury for today's dinner."); -- Query OK, 1 row affected (0.00 sec) INSERT INTO memos VALUES (null, "Update mroonga tomorrow."); -- Query OK, 1 row affected (0.00 sec) INSERT INTO memos VALUES (null, "Buy some dumpling on the way home."); -- Query OK, 1 row affected (0.00 sec) INSERT INTO memos VALUES (null, "Thank God It's meat day."); -- Query OK, 1 row affected (0.00 sec) To get the record ID, you invoke SELECT with _id column: SELECT * FROM memos; -- +------+------------------------------------------+ -- | _id | content | -- +------+------------------------------------------+ -- | 1 | Saury for today's dinner. | -- | 2 | Update mroonga tomorrow. | -- | 3 | Buy some dumpling on the way home. | -- | 4 | Thank God It's meat day. | -- +------+------------------------------------------+ -- 4 rows in set (0.00 sec) By using last_insert_grn_id function, you can also get the record ID that is assigned by the last INSERT: INSERT INTO memos VALUES (null, "Just one bottle of milk in the fridge."); -- Query OK, 1 row affected (0.00 sec) SELECT last_insert_grn_id(); -- +----------------------+ -- | last_insert_grn_id() | -- +----------------------+ -- | 5 | -- +----------------------+ -- 1 row in set (0.00 sec) last_insert_grn_id function is included in Mroonga as a User-Defined Function (UDF), but if you have not yet register it in MySQL by CREATE FUNCTION, you need to invoke the following SQL for defining a function: CREATE FUNCTION last_insert_grn_id RETURNS INTEGER SONAME 'ha_mroonga.so'; As you can see in the example above, you can get the record ID by _id column or last_insert_grn_id function. It will be useful to use this value in the ensuing SQL queries like UPDATE: UPDATE memos SET content = "So much milk in the fridge." WHERE _id = last_insert_grn_id(); -- Query OK, 1 row affected (0.00 sec) -- Rows matched: 1 Changed: 1 Warnings: 0 How to get snippet (Keyword in context) Mroonga provides functionality to get keyword in context. It is imple- mented as mroonga_snippet() UDF. How to use similar search Similar search is supported by specifying document itself instead of specifying keywords in query. It is useful if you want to find documents which is related to specific document. Here is the schema definition for execution examples: CREATE TABLE similarities ( id INT PRIMARY KEY AUTO_INCREMENT, title VARCHAR(32), content VARCHAR(255), FULLTEXT INDEX (content) ) ENGINE = Mroonga DEFAULT CHARSET utf8; Here is the sample data for execution examples: INSERT INTO similarities (title, content) VALUES ('Groonga similar search', 'Groonga is an open-source fulltext search engine and column store.'); INSERT INTO similarities (title, content) VALUES ('Mroonga similar search', 'Mroonga is an open-source storage engine for fast fulltext search with MySQL.'); INSERT INTO similarities (title, content) VALUES ('Rroonga library', 'A library to use Groonga features from Ruby.'); Here is the example of similar search by content itself. SELECT title FROM similarities WHERE MATCH(content) AGAINST ('There are many open-source fulltext search engine.' IN NATURAL LANGUAGE MODE); Note that you need to specify an content of document in AGAINST('...' IN NATURAL LANGUAGE MODE). SELECT title FROM similarities WHERE MATCH(content) AGAINST ('There are many open-source fulltext search engine.' IN NATURAL LANGUAGE MODE); -- +------------------------+ -- | title | -- +------------------------+ -- | Groonga similar search | -- | Mroonga similar search | -- +------------------------+ -- 2 rows in set (0.00 sec) To improve degree of similarity, you may need to use language specific tokenizer or use labeled data such as tag to get rid of some undesir- able search results. For example, if you want to execute similar search against Japanese text, it is recommended to use language specific tokenizer - Token- Mecab. Here is the schema definition to use TokenMecab tokenizer: CREATE TABLE similarities ( id INT PRIMARY KEY AUTO_INCREMENT, title VARCHAR(32), content VARCHAR(255), FULLTEXT INDEX (content) COMMENT 'tokenizer "TokenMecab"' ) ENGINE = Mroonga DEFAULT CHARSET utf8; How to run Groonga command In storage mode, Mroonga stores all your data into Groonga database. You can access Groonga database by SQL with Mroonga. SQL is very power- ful but it is not good for some operations such as faceted search. Faceted search is popular recently. Many online shopping sites such as amazon.com and ebay.com support faceted search. Faceted search refines the current search by available search parameters before users refine their search. And faceted search shows refined searches. Users just se- lect a refined search. Users benefit from faceted search: • Users don't need to think about how to refine their search. Users just select a showed refined search. • Users don't get boared "not match" page. Faceted search showes only refined searches that has one or more matched items. Faceted search needs multiple GROUP BY operations against searched re- sult set. To do faceted search by SQL, multiple SELECT requests are needed. It is not effective. Groonga can do faceted search by only one groonga command. It is effec- tive. Groonga has the select command that can search records with faceted search. Faceted search is called as "drilldown" in Groonga. See Groonga's documentation about select command for more details. Mroonga provides mroonga_command() function. You can run Groonga com- mand in SQL by the function. But you should use only select command. Other commands that change schema or data may break consistency. Here is the schema definition for execution examples: CREATE TABLE diaries ( id INT PRIMARY KEY AUTO_INCREMENT, content VARCHAR(255), date DATE, year YEAR, `year_month` VARCHAR(9), tag VARCHAR(32), FULLTEXT INDEX (content) ) ENGINE = Mroonga DEFAULT CHARSET utf8; Here is the sample data for execution examples: INSERT INTO diaries (content, date, year, `year_month`, tag) VALUES ('Groonga is an open-source fulltext search engine and column store.', '2013-04-08', '2013', '2013-04', 'groonga'); INSERT INTO diaries (content, date, year, `year_month`, tag) VALUES ('Mroonga is an open-source storage engine for fast fulltext search with MySQL.', '2013-04-09', '2013', '2013-04', 'MySQL'); INSERT INTO diaries (content, date, year, `year_month`, tag) VALUES ('Tritonn is a patched version of MySQL that supports better fulltext search function with Senna.', '2013-03-29', '2013', '2013-03', 'MySQL'); Each record has groonga or MySQL as tag. Each record also has year and year_month. You can use tag, year and year_month as faceted search keys. Groonga calls faceted search as drilldown. So parameter key in Groonga is --drilldown. Groonga returns search result as JSON. So mroonga_com- mand() also returns search result as JSON. It is not SQL friendly. You need to parse search result JSON by yourself. Here is the example of faceted search by all available faceted search keys (result JSON is pretty printed): SELECT mroonga_command("select diaries --output_columns _id --limit 0 --drilldown tag,year,year_month") AS faceted_result; -- +-----------------------------+ -- | faceted_result | -- +-----------------------------+ -- | [[[3], | -- | [["_id","UInt32"]]], | -- | [[2], | -- | [["_key","ShortText"], | -- | ["_nsubrecs","Int32"]], | -- | ["groonga",1], | -- | ["MySQL",2]], | -- | [[1], | -- | [["_key","Time"], | -- | ["_nsubrecs","Int32"]], | -- | [1356998400.0,3]], | -- | [[2], | -- | [["_key","ShortText"], | -- | ["_nsubrecs","Int32"]], | -- | ["2013-04",2], | -- | ["2013-03",1]]] | -- +-----------------------------+ -- 1 row in set (0.00 sec) The first element [[3], [["_id","UInt32"]]] is normal search result. It's not faceted search result. The second, third and forth elements are faceted search results: [[2], [["_key","ShortText"], ["_nsubrecs","Int32"]], ["groonga",1], ["MySQL",2]] [[1], [["_key","Time"], ["_nsubrecs","Int32"]], [1356998400.0,3]] [[2], [["_key","ShortText"], ["_nsubrecs","Int32"]], ["2013-04",2], ["2013-03",1]] The order of faceted search results is corresponding to the value of --drilldown. In this example, we specified tag, year and year_month as --drilldown value. So the first faceted search result is for tag, the second one is for year and the third one is for year_month. Each faceted search result uses the following format. This is the same as normal search result: [[${THE_NUMBER_OF_RECORDS}], [[${OUTPUT_COLUMN_NAME_0}, ${OUTPUT_COLUMN_TYPE_0}], [${OUTPUT_COLUMN_NAME_1}, ${OUTPUT_COLUMN_TYPE_1}], ..., [${OUTPUT_COLUMN_NAME_N}, ${OUTPUT_COLUMN_TYPE_N}]] [${OUTPUT_COLUMN_VALUE_0_FOR_RECORD_0}, ${OUTPUT_COLUMN_VALUE_1_FOR_RECORD_0}, ..., ${OUTPUT_COLUMN_VALUE_N_FOR_RECORD_0}], [${OUTPUT_COLUMN_VALUE_0_FOR_RECORD_1}, ${OUTPUT_COLUMN_VALUE_1_FOR_RECORD_1}, ..., ${OUTPUT_COLUMN_VALUE_N_FOR_RECORD_1}], ... [${OUTPUT_COLUMN_VALUE_0_FOR_RECORD_M}, ${OUTPUT_COLUMN_VALUE_1_FOR_RECORD_M}, ..., ${OUTPUT_COLUMN_VALUE_N_FOR_RECORD_M}]] The _key column value in faceted search result shows faceted search key. For example, the first faceted search result (the faceted search result for tag) shows that matched records (all records in this case) have groonga and MySQL as tag value: [[2], [["_key","ShortText"], ["_nsubrecs","Int32"]], ["groonga",1], ["MySQL",2]] The _nsubrecs column value in faceted search result shows the number of records that have the corresponding faceted search key. For example, the first faceted search result (the faceted search result for tag) shows that there are 1 record that has groonga as tag value and 2 records that have MySQL as tag value: [[2], [["_key","ShortText"], ["_nsubrecs","Int32"]], ["groonga",1], ["MySQL",2]] See Groonga's documentation about select command for more details. How to search by regular expression In storage mode, you can use a Groonga's functionality from Mroonga as described above. Thus, you can search records by using a regular ex- pression via Groonga's functionality. There are some conditions to use regular expression in Mroonga. • Create an index with TokenRegexp tokenizer which is used in COMMENT • Use *SS pragma and @~ in WHERE MATCH ... AGAINST clause Here is the example of search by regular expression. CREATE TABLE paths ( content text, FULLTEXT INDEX content_index (content) COMMENT 'tokenizer "TokenRegexp", normalizer "NormalizerAuto"' ) ENGINE=Mroonga DEFAULT CHARSET=utf8mb4; INSERT INTO paths VALUES ('/usr/bin/groonga'); INSERT INTO paths VALUES ('/var/log/auth.log'); INSERT INTO paths VALUES ('/var/log/messages'); INSERT INTO paths VALUES ('/tmp/local/var/log/auth.log'); SELECT * FROM paths WHERE MATCH(content) AGAINST ('*SS content @~ "\\\\A/var/log/auth"' IN BOOLEAN MODE); -- +-------------------+ -- | content | -- +-------------------+ -- | /var/log/auth.log | -- +-------------------+ -- 1 row in set (0.024 sec) By using *SS pragma, you can search the records which matches /var/log/auth.log with content @~ "\\\\A/var/log/auth". @~ is a Groonga's operator which executes a regular expression search, and "\\\\A/var/log/auth" executes prefix search, so it matches to only /var/log/auth.log. /tmp/local/var/log/auth.log doesn't match because it doesn't begin with "/var/log/auth". See Groonga's regular expression document for more syntax details. Logging Mroonga outputs the logs by default. Log files are located in MySQL's data directory with the filename groonga.log. Here is the example of the log. 2010-10-07 17:32:39.209379|n|b1858f80|mroonga 1.10 started. 2010-10-07 17:32:44.934048|d|46953940|hash get not found (key=test) 2010-10-07 17:32:44.936113|d|46953940|hash put (key=test) The default log level is NOTICE, i.e. we have important information only and we don't have debug information etc.). You can get the log level by mroonga_log_level system variable, that is a global variable. You can also modify it dynamically by using SET phrase: SHOW VARIABLES LIKE 'mroonga_log_level'; -- +-------------------+--------+ -- | Variable_name | Value | -- +-------------------+--------+ -- | mroonga_log_level | NOTICE | -- +-------------------+--------+ -- 1 row in set (0.00 sec) SET GLOBAL mroonga_log_level=DUMP; -- Query OK, 0 rows affected (0.00 sec) SHOW VARIABLES LIKE 'mroonga_log_level'; -- +-------------------+-------+ -- | Variable_name | Value | -- +-------------------+-------+ -- | mroonga_log_level | DUMP | -- +-------------------+-------+ -- 1 row in set (0.00 sec) Available log levels are the followings. • NONE • EMERG • ALERT • CRIT • ERROR • WARNING • NOTICE • INFO • DEBUG • DUMP See mroonga_log_level about details. You can reopen the log file by FLUSH LOGS. If you want to rotate the log file without stopping MySQL server, you can do in the following procedure. 1. change the file name of groonga.log (by using OS's mv command etc.). 2. invoke "FLUSH LOGS" in MySQL server (by mysql command or mysqladmin command). Next step Now, you can use Mroonga as storage mode! If you want Mroonga to be faster, see also Optimizations. FOOTNOTES [1] MySQL 5.1 Reference Manual :: 11 Functions and Operations :: 11.7 Full-Text Search Functions [2] In Groonga, we call it a 'tokenizer'. Wrapper mode Here we explain how to use wrapper mode of Mroonga How to use wrapper mode In wrapper mode, Mroonga works in wrapping an existing storage engine. To specify the wrapped storage engine, we use SQL comment like COMMENT = 'engine "InnoDB"' for now. NOTE: For now, a primary key is mandatory in wrapper mode. That is not the case with storage mode. NOTE: Wrapper mode supports the followings, that are not supported in storage mode for now. • null value • transaction (if storage engine supports. Note that rollback causes mismatch of indexes, it may affects search results, so recreate index of Mroonga in such a case.) How to use full text search After confirming the installation, let's create a table. The important point is to specify Mroonga by ENGINE = Mroonga. mysql> CREATE TABLE diaries ( -> id INT PRIMARY KEY AUTO_INCREMENT, -> content VARCHAR(255), -> FULLTEXT INDEX (content) -> ) ENGINE = Mroonga COMMENT = 'engine "InnoDB"' DEFAULT CHARSET utf8; Query OK, 0 rows affected (0.52 sec) We put data by INSERT. mysql> INSERT INTO diaries (content) VALUES ("It'll be fine tomorrow."); Query OK, 1 row affected (0.01 sec) mysql> INSERT INTO diaries (content) VALUES ("It'll rain tomorrow"); Query OK, 1 row affected (0.00 sec) Try full text search. mysql> SELECT * FROM diaries WHERE MATCH(content) AGAINST("+fine" IN BOOLEAN MODE); +----+-----------------------------------------+ | id | content | +----+-----------------------------------------+ | 1 | It'll be fine tomorrow. | +----+-----------------------------------------+ 1 row in set (0.00 sec) Yes, full text search works. How to get search score We often want to display more relevant results first in full text search. We use search score in such case. We can get search score by MySQL's standard way [1], i.e. we use MATCH...AGAINST in one of columns in SELECT or ORDER BY. Let's try. mysql> INSERT INTO diaries (content) VALUES ("It's fine today. It'll be fine tomorrow as well."); Query OK, 1 row affected (0.00 sec) mysql> INSERT INTO diaries (content) VALUES ("It's fine today. But it'll rain tomorrow."); Query OK, 1 row affected (0.00 sec) mysql> SELECT *, MATCH (content) AGAINST ("+fine" IN BOOLEAN MODE) FROM diaries WHERE MATCH (content) AGAINST ("+fine" IN BOOLEAN MODE) ORDER BY MATCH (content) AGAINST ("+fine" IN BOOLEAN MODE) DESC; +----+--------------------------------------------------+---------------------------------------------------+ | id | content | MATCH (content) AGAINST ("+fine" IN BOOLEAN MODE) | +----+--------------------------------------------------+---------------------------------------------------+ | 3 | It's fine today. It'll be fine tomorrow as well. | 2 | | 1 | It'll be fine tomorrow. | 1 | | 4 | It's fine today. But it'll rain tomorrow. | 1 | +----+--------------------------------------------------+---------------------------------------------------+ 3 rows in set (0.00 sec) The result having the search word fine more, i.e. id = 3 message having the higher search score, is displayed first. And you also get search score by using MATCH AGAINST in SELECT phrase. You can use AS to change the attribute name. mysql> SELECT *, MATCH (content) AGAINST ("+fine" IN BOOLEAN MODE) AS score FROM diaries WHERE MATCH (content) AGAINST ("+fine" IN BOOLEAN MODE) ORDER BY MATCH (content) AGAINST ("+fine" IN BOOLEAN MODE) DESC; +----+--------------------------------------------------+-------+ | id | content | score | +----+--------------------------------------------------+-------+ | 3 | It's fine today. It'll be fine tomorrow as well. | 2 | | 1 | It'll be fine tomorrow. | 1 | | 4 | It's fine today. But it'll rain tomorrow. | 1 | +----+--------------------------------------------------+-------+ 3 rows in set (0.00 sec) How to specify the parser for full text search MySQL has the following syntax to specify the parser [2] for full text search. FULLTEXT INDEX (content) WITH PARSER parser_name To use this syntax, you need to register all parsers in MySQL before- hand. On the other hand, Groonga can dynamically add a tokenizer, that is a parser in MySQL. So if use this syntax in Mroonga, tokenizers that are added in Groonga dynamically cannot be supported. We think that this limitation decreases the convenience, and we choose our own syntax using COMMENT like the following. FULLTEXT INDEX (content) COMMENT 'tokenizer "TokenMecab"' NOTE: COMMENT in FULLTEXT INDEX is only supported MySQL 5.5 or later. If you use MySQL 5.1, use mroonga_default_parser variable described be- low. You can specify one of following values as the tokenizer. Available tokenizers +----------------------------+----------------------------+ | Tokenizer | Description | +----------------------------+----------------------------+ | off | (Deprecated) It does not | | | tokenize at all. Use off | | | if you want to treat con- | | | tent as is. For example, | | | this value is used for | | | prefix search. | +----------------------------+----------------------------+ | none | Added in version 8.07: It | | | does not tokenize at all. | | | Use none if you want to | | | treat content as is. For | | | example, this value is | | | used for prefix search. | +----------------------------+----------------------------+ | TokenBigram | It tokenises in bigram. | | | But continuous alphabets, | | | numbers or symbols are | | | treated as a token. So | | | there can exist tokes with | | | 3 letters or more. It is | | | to reduce noises. | | | | | | This is the default value. | +----------------------------+----------------------------+ | TokenMecab | It tokenises using MeCab. | | | Groonga should be built | | | with MeCab support. | +----------------------------+----------------------------+ | TokenBigramSplitSymbol | It tokenises in bigram. | | | Unlike TokenBigram, con- | | | tinuous symbols are not | | | treated as a token, but | | | tokenised in bigram. | | | | | | When you use TokenBigram- | | | SplitSymbol instead of To- | | | kenBigram, "!?" can match | | | "!?!?!?" in "Is it re- | | | ally!?!?!?". But when you | | | use TokenBigram, only | | | "!?!?!?" can match as | | | well. | +----------------------------+----------------------------+ | TokenBigramSplitSymbolAl- | It tokenise in bigram. In | | pha | addition to TokenBigram- | | | SplitSymbol, continuous | | | alphabets are not treated | | | as a token either, but to- | | | kenised in bigram. | | | | | | When you use TokenBigram- | | | SplitSymbolAlpha instead | | | of TokenBigram, "real" can | | | match "Is it really?". But | | | when you use TokenBigram, | | | only "really" can match as | | | well. | +----------------------------+----------------------------+ | TokenBigramSplitSymbolAl- | It tokenise in bigram. In | | phaDigit | addition to TokenBigram- | | | SplitSymbolAlpha, continu- | | | ous numbers are not | | | treated as a token either, | | | but tokenised in bigram. | | | So any kind of characters | | | are treated equally in bi- | | | gram. | | | | | | When you use TokenBigram- | | | SplitSymbolAlphaDigit in- | | | stead of TokenBigram, | | | "567" can match | | | "090-0123-4567". But when | | | you use TokenBigram, only | | | "4567" can match as well. | +----------------------------+----------------------------+ | TokenBigramIgnoreBlank | It tokenise in bigram. Un- | | | like TokenBigram, it ig- | | | nores white spaces. | | | | | | When you use TokenBi- | | | gramIgnoreBlank instead of | | | TokenBigram, "" can match | | | " ". But when you use | | | TokenBigram, only " " | | | can match as well. | +----------------------------+----------------------------+ | TokenBigramIgnore- | It tokenise in bigram. Un- | | BlankSplitSymbol | like TokenBigramSplitSym- | | | bol, it ignores white | | | spaces. | | | | | | When you use TokenBi- | | | gramIgnoreBlankSplitSymbol | | | instead of TokenBigram- | | | SplitSymbol, "???" can | | | match "! ? ???". But when | | | you use TokenBigramSplit- | | | Symbol, only "? ??" can | | | match as well. | +----------------------------+----------------------------+ | TokenBigramIgnore- | It tokenise in bigram. Un- | | BlankSplitSymbolAlpha | like TokenBigramSplitSym- | | | bolAlpha, it ignores white | | | spaces. | | | | | | When you use TokenBi- | | | gramIgnoreBlankSplitSymbo- | | | lAlpha instead of TokenBi- | | | gramSplitSymbolAlpha, | | | "ama" can match "I am a | | | pen.". But when you use | | | TokenBigramSplitSymbolAl- | | | pha, only "am a" can match | | | as well. | +----------------------------+----------------------------+ | TokenBigramIgnore- | It tokenise in bigram. Un- | | BlankSplitSymbolAlphaDigit | like TokenBigramSplitSym- | | | bolAlphaDigit, it ignores | | | white spaces. | | | | | | When you use TokenBi- | | | gramIgnoreBlankSplitSymbo- | | | lAlphaDigit instead of To- | | | kenBigramSplitSymbolAl- | | | phaDigit, "9001" can match | | | "090 0123 4567". But when | | | you use TokenBigramSplit- | | | SymbolAlphaDigit, only "90 | | | 01" can match as well. | +----------------------------+----------------------------+ | TokenDelimit | It tokenise by splitting | | | with a white space. | | | | | | "movie horror topic" will | | | be tokenised as "movie", | | | "horror", "topic". | +----------------------------+----------------------------+ | TokenDelimitNull | It tokenise by splitting | | | with a null character | | | (\\0). | | | | | | "movie\\0horror\\0topic" | | | will be tokenised as | | | "movie", "horror", | | | "topic". | +----------------------------+----------------------------+ | TokenUnigram | It tokenises in unigram. | | | But continuous alphabets, | | | numbers or symbols are | | | treated as a token. So | | | there can exist tokes with | | | 2 letters or more. It is | | | to reduce noises. | +----------------------------+----------------------------+ | TokenTrigram | It tokenises in trigram. | | | But continuous alphabets, | | | numbers or symbols are | | | treated as a token. So | | | there can exist tokes with | | | 4 letters or more. It is | | | to reduce noises. | +----------------------------+----------------------------+ You can specify the default tokenizer by passing --with-default-tok- enizer option in configure when you build Mroonga. ./configure --with-default-tokenizer TokenMecab ... Or you can set mroonga_default_tokenizer variable in my.cnf or by SQL. If you specify it in my.cnf, the change will not be lost after restart- ing MySQL, but you need to restart MySQL to make it effective. On the other hand, if you set it in SQL, the change is effective immediately, but it will be lost when you restart MySQL. my.cnf: [mysqld] mroonga_default_tokenizer=TokenMecab SQL: mysql> SET GLOBAL mroonga_default_tokenizer = TokenMecab; Query OK, 0 rows affected (0.00 sec) How to specify the normalizer Mroonga uses normalizer corresponding to the encoding of document. It is used when tokenizing text and storing table key. It is used NormalizerMySQLGeneralCI normalizer when the encoding is utf8_general_ci or utf8mb4_general_ci. It is used NormalizerMySQLUnicodeCI normalizer when the encoding is utf8_unicode_ci or utf8mb4_unicode_ci. It isn't used normalizer when the encoding is utf8_bin. Here is an example that uses NormalizerMySQLUnicodeCI normalizer by specifying utf8_unicode_ci.: mysql> SET NAMES utf8; Query OK, 0 rows affected (0.00 sec) mysql> CREATE TABLE diaries ( -> day DATE PRIMARY KEY, -> content VARCHAR(64) NOT NULL, -> FULLTEXT INDEX (content) -> ) Engine=Mroonga DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; Query OK, 0 rows affected (0.18 sec) mysql> INSERT INTO diaries VALUES ("2013-04-23", " "); Query OK, 1 row affected (0.00 sec) mysql> SELECT * FROM diaries -> WHERE MATCH (content) AGAINST ("+" IN BOOLEAN MODE); +------------+-----------------------------------------+ | day | content | +------------+-----------------------------------------+ | 2013-04-23 | | +------------+-----------------------------------------+ 1 row in set (0.00 sec) mysql> SELECT * FROM diaries -> WHERE MATCH (content) AGAINST ("+" IN BOOLEAN MODE); +------------+-----------------------------------------+ | day | content | +------------+-----------------------------------------+ | 2013-04-23 | | +------------+-----------------------------------------+ 1 row in set (0.00 sec) Mroonga has the following syntax to specify Groonga's normalizer: FULLTEXT INDEX (content) COMMENT 'normalizer "NormalizerAuto"' See Groonga's documentation about normalizer for more details. Here is an example that uses NormalizerAuto normalizer: mysql> SET NAMES utf8; Query OK, 0 rows affected (0.00 sec) mysql> CREATE TABLE diaries ( -> day DATE PRIMARY KEY, -> content VARCHAR(64) NOT NULL, -> FULLTEXT INDEX (content) COMMENT 'normalizer "NormalizerAuto"' -> ) Engine=Mroonga DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; Query OK, 0 rows affected (0.19 sec) mysql> INSERT INTO diaries VALUES ("2013-04-23", " "); Query OK, 1 row affected (0.00 sec) mysql> SELECT * FROM diaries -> WHERE MATCH (content) AGAINST ("+" IN BOOLEAN MODE); Empty set (0.00 sec) mysql> SELECT * FROM diaries -> WHERE MATCH (content) AGAINST ("+" IN BOOLEAN MODE); +------------+-----------------------------------------+ | day | content | +------------+-----------------------------------------+ | 2013-04-23 | | +------------+-----------------------------------------+ 1 row in set (0.00 sec) How to get snippet (Keyword in context) Mroonga provides functionality to get keyword in context. It is imple- mented as mroonga_snippet() UDF. Logging Mroonga outputs the logs by default. Log files are located in MySQL's data directory with the filename groonga.log. Here is the example of the log. 2010-10-07 17:32:39.209379|n|b1858f80|mroonga 1.10 started. 2010-10-07 17:32:44.934048|d|46953940|hash get not found (key=test) 2010-10-07 17:32:44.936113|d|46953940|hash put (key=test) The default log level is NOTICE, i.e. we have important information only and we don't have debug information etc.). You can get the log level by mroonga_log_level system variable, that is a global variable. You can also modify it dynamically by using SET phrase. mysql> SHOW VARIABLES LIKE 'mroonga_log_level'; +-------------------+--------+ | Variable_name | Value | +-------------------+--------+ | mroonga_log_level | NOTICE | +-------------------+--------+ 1 row in set (0.00 sec) mysql> SET GLOBAL mroonga_log_level=DUMP; Query OK, 0 rows affected (0.00 sec) mysql> SHOW VARIABLES LIKE 'mroonga_log_level'; +-------------------+-------+ | Variable_name | Value | +-------------------+-------+ | mroonga_log_level | DUMP | +-------------------+-------+ 1 row in set (0.00 sec) Available log levels are the followings. • NONE • EMERG • ALERT • CRIT • ERROR • WARNING • NOTICE • INFO • DEBUG • DUMP See mroonga_log_level about details. You can reopen the log file by FLUSH LOGS. If you want to rotate the log file without stopping MySQL server, you can do in the following procedure. 1. change the file name of groonga.log (by using OS's mv command etc.). 2. invoke "FLUSH LOGS" in MySQL server (by mysql command or mysqladmin command). Next step Now, you can use Mroonga as wrapper mode! If you want Mroonga to be faster, see also Optimizations. FOOTNOTES [1] MySQL 5.1 Reference Manual :: 11 Functions and Operations :: 11.7 Full-Text Search Functions [2] In Groonga, we call it a 'tokenizer'. REFERENCE Optimizations Mroonga implemented some optimizations to return response faster. Some optimizations can work only on Storage mode. Fetching only needed columns This optimization can work only on Storage mode. Groonga uses column store architecture. It means that Groonga doesn't need to fetch values of all columns for fetching a column value in a row. Groonga can fetch values of only needed columns. InnoDB and MyISAM use row store architecture. They need to fetch values of all columns for fetching a column value in a row. If you specify only columns you needed in SELECT, Mroonga just fetches only values of these columns. Mroonga doesn't fetch values of other columns. Mroonga can work faster by reducing operations and I/O. It's this optimization. Here is a sample table definition to describe this optimization: CREATE TABLE t1 ( c1 INT PRIMARY KEY AUTO_INCREMENT, c2 INT, c3 INT, ... c11 VARCHAR(20), c12 VARCHAR(20), ... c20 DATETIME ) ENGINE=Mroonga DEFAULT CHARSET=utf8; Here is a SELECT to describe this optimization: SELECT c1, c2, c11 FROM t1 WHERE c2 = XX AND c12 = "XXX"; In this case, Mroonga fetches values from only c1, c2, c11 and c12. Mroonga doesn't fetch values from c3, c4, ..., c10, c13, ..., c19 and c20. Row count This optimization can work only on Storage mode. MySQL requires all column values from storage engine for processing COUNT(*) even if COUNT(*) doesn't need them. Mroonga doesn't fetch any column values for the case. Mroonga can work faster by reducing operations and I/O. It's this optimization. Here is a SELECT to describe this optimization: SELECT COUNT(*) FROM t1 WHERE MATCH(c2) AGAINST("+keyword" IN BOOLEAN MODE); The SELECT fetches only COUNT(*) and condition in WHERE can be processed only by index. In this case, Mroonga uses this optimization. You can confirm whether this optimization is used or not by looking Mroonga_count_skip status variable: mysql> SHOW STATUS LIKE 'Mroonga_count_skip'; +--------------------+-------+ | Variable_name | Value | +--------------------+-------+ | Mroonga_count_skip | 1 | +--------------------+-------+ 1 row in set (0.00 sec) Mroonga_count_skip status variable is incremented when Mroonga uses this optimization. You can disable this optimization by setting mroonga_enable_optimization to false. ORDER BY LIMIT This optimization can work on both Storage mode and Wrapper mode. MySQL can process ORDER BY and LIMIT with low cost if you can get sorted records by index even if the number of matched records is very big. MySQL can do the process for MATCH() AGAINST(IN NATURAL LANGUAGE MODE). But MySQL can't do the process for MATCH() AGAINST(IN BOOLEAN MODE). It means that MySQL might take long time for MATCH() AGAINST(IN BOOLEAN MODE) that matches with many records. Mroonga processes ORDER BY and LIMIT by Groonga and returns only target records to MySQL. It's very faster for query that matches with many records. It's this optimization. Here is a SELECT to describe this optimization: SELECT * FROM t1 WHERE MATCH(c2) AGAINST("+keyword" IN BOOLEAN MODE) ORDER BY c1 LIMIT 1; The SELECT runs full text search and sorts by Groonga and returns only one record to MySQL. You can confirm whether this optimization is used or not by looking Mroonga_fast_order_limit status variable: mysql> SHOW STATUS LIKE 'Mroonga_fast_order_limit'; +--------------------------+-------+ | Variable_name | Value | +--------------------------+-------+ | Mroonga_fast_order_limit | 1 | +--------------------------+-------+ 1 row in set (0.00 sec) Mroonga_fast_order_limit status variable is incremented when Mroonga uses this optimization. This optimization is used only when all the following conditions are true: • Storage mode: WHERE clause has one MATCH AGAINST and zero or more arithmetic operations such as column < 100. • Wrapper mode: WHERE clause has only MATCH AGAINST. • No JOIN • No GROUP BY • No SQL_CALC_FOUND_ROWS • With LIMIT • Storage mode: ORDER BY clause has only columns or MATCH AGAINST that is used in WHERE clause. • Wrapper mode: ORDER BY clause has only primary column or MATCH AGAINST that is used in WHERE clause. SQL commands This section describes avaiable SQL commands and unavailable SQL com- mands. List of available SQL commands • ALTER TABLE table_name ADD COLUMN column_name TEXT • ALTER TABLE table_name ADD FULLTEXT INDEX index_name (column_name) • ALTER TABLE table_name ADD SPATIAL KEY index_name (geometry_col- umn_name) • ALTER TABLE table_name ENGINE = mroonga • ALTER TABLE table_name RENAME new_table_name • COMMIT • CREATE FULLTEXT INDEX index_name ON table_name(column_name) • CREATE FULLTEXT INDEX index_name ON table_name(column_name, ...) • CREATE TABLE IF NOT EXISTS table_name (...) • CREATE TABLE \`table-name-with-hyphen\` (...) • CREATE TABLE table_name (... FULLTEXT INDEX index_name (column_name) COMMENT 'tokenizer "TokenMecab"') • CREATE TABLE table_name (... FULLTEXT INDEX index_name (column_name) COMMENT 'normalizer "NormalizerMySQLUnicodeCIExceptKanaCIKanaWith- VoicedSoundMark"') • CREATE TABLE table_name (... FULLTEXT INDEX index_name (column_name) COMMENT 'tokenizer "TokenDelimit", normalizer "NormalizerMySQLUni- codeCIExceptKanaCIKanaWithVoicedSoundMark"') • CREATE TABLE table_name (... FULLTEXT INDEX index_name (column_name)) • CREATE TABLE table_name (... FULLTEXT INDEX index_name (column_name, ...)) • CREATE TABLE table_name (... INDEX (column_name)) • CREATE TABLE table_name (... INDEX (column_name, ...)) • CREATE TABLE table_name (... INDEX USING BTREE (column_name)) • CREATE TABLE table_name (... PRIMARY KEY (column_name)) • CREATE TABLE table_name (...) CHARSET ASCII • CREATE TABLE table_name (...) CHARSET BINARY • CREATE TABLE table_name (...) CHARSET CP932 • CREATE TABLE table_name (...) CHARSET EUCJPMS • CREATE TABLE table_name (...) CHARSET KOI8R • CREATE TABLE table_name (...) CHARSET LATIN1 • CREATE TABLE table_name (...) CHARSET SJIS • CREATE TABLE table_name (...) CHARSET UJIS • CREATE TABLE table_name (...) CHARSET UTF8 • CREATE TABLE table_name (...) CHARSET UTF8MB4 • CREATE TABLE table_name (...) ENGINE=mroonga COMMENT = 'ENGINE "Inn- oDB"' • CREATE TABLE table_name (...) ENGINE=mroonga • CREATE TABLE table_name (\`_id\` INT) • CREATE TABLE table_name (\`_id\` INT, KEY(_id) USING HASH) • CREATE TABLE table_name (column_name BIGINT UNSIGNED) • CREATE TABLE table_name (column_name BIGINT) • CREATE TABLE table_name (column_name BINARY(...)) • CREATE TABLE table_name (column_name BIT) • CREATE TABLE table_name (column_name BLOB) • CREATE TABLE table_name (column_name CHAR(...)) • CREATE TABLE table_name (column_name DATE) • CREATE TABLE table_name (column_name DATETIME) • CREATE TABLE table_name (column_name DECIMAL) • CREATE TABLE table_name (column_name DOUBLE) • CREATE TABLE table_name (column_name ENUM(...)) • CREATE TABLE table_name (column_name FLOAT) • CREATE TABLE table_name (column_name GEOMETRY NOT NULL) • CREATE TABLE table_name (column_name INT UNSIGNED) • CREATE TABLE table_name (column_name INT) • CREATE TABLE table_name (column_name LONGBLOB) • CREATE TABLE table_name (column_name LONGTEXT) • CREATE TABLE table_name (column_name MEDIUMBLOB) • CREATE TABLE table_name (column_name MEDIUMINT UNSIGNED) • CREATE TABLE table_name (column_name MEDIUMINT) • CREATE TABLE table_name (column_name MEDIUMTEXT) • CREATE TABLE table_name (column_name SET(...)) • CREATE TABLE table_name (column_name SMALLINT UNSIGNED) • CREATE TABLE table_name (column_name SMALLINT) • CREATE TABLE table_name (column_name TEXT) • CREATE TABLE table_name (column_name TIME) • CREATE TABLE table_name (column_name TIMESTAMP) • CREATE TABLE table_name (column_name TINYBLOB) • CREATE TABLE table_name (column_name TINYINT UNSIGNED) • CREATE TABLE table_name (column_name TINYINT) • CREATE TABLE table_name (column_name TINYTEXT) • CREATE TABLE table_name (column_name VARBINARY(...)) • CREATE TABLE table_name (column_name VARCHAR(...)) • CREATE TABLE table_name (column_name YEAR) • CREATE TABLE table_name (column_name INT AUTO_INCREMENT) • CREATE TABLE table_name (column_name INT PRIMARY KEY) • CREATE TABLE table_name (column_name DATE PRIMARY KEY) • CREATE TABLE table_name (column_name DATETIME PRIMARY KEY) • CREATE TABLE table_name (column_name DATETIME(fractional_seconds_pre- cision) PRIMARY KEY) • CREATE TABLE table_name (column_name DECIMAL PRIMARY KEY) • CREATE TABLE table_name (column_name DECIMAL(maximum_number_digits, fractional_seconds_precision) PRIMARY KEY) • CREATE TABLE table_name (column_name TIME PRIMARY KEY) • CREATE TABLE table_name (column_name TIME(fractional_seconds_preci- sion) PRIMARY KEY) • CREATE TABLE table_name (column_name TIMESTAMP PRIMARY KEY) • CREATE TABLE table_name (column_name TIMESTAMP(fractional_sec- onds_precision) PRIMARY KEY) • CREATE TABLE table_name (column_name YEAR PRIMARY KEY) • CREATE TABLE table_name (column_name1 INT, column_name2 ..., KEY (column_name1, column_name2)) • CREATE TABLE table_name (column_name1 DOUBLE, column_name2 ..., KEY (column_name1, column_name2)) • CREATE TABLE table_name (column_name1 FLOAT, column_name2 ..., KEY (column_name1, column_name2)) • CREATE TABLE table_name (column_name1 CHAR(...), column_name2 ..., KEY (column_name1, column_name2)) • CREATE TABLE table_name (column_name1 VARCHAR(...), column_name2 ..., KEY (column_name1, column_name2)) • CREATE TEMPORARY TABLE table_name (...) • DELETE FROM table_name WHERE ... • DROP INDEX column_name ON table_name • DROP TABLE IF EXISTS table_name, ... • DROP TABLE \`table-name-with-hyphen\`, ... • DROP TABLE table_name, ... • DROP TEMPORARY TABLE table_name • FLUSH LOGS • FLUSH TABLES • INSERT INTO (geometry_column_name) VALUES (GeomFrom- Text('POINT(...)')) • INSERT INTO ... VALUES ... • INSERT INTO ... VALUES ... ON DUPLICATE KEY UPDATE ... (for PRIMARY KEY) • INSERT INTO ... VALUES ... ON DUPLICATE KEY UPDATE ... (for UNIQUE KEY) • INSERT INTO table_name (column_name, ...) SELECT ... FROM other_ta- ble_name • REPLACE INTO table_name SELECT ... FROM other_table_name ... • SELECT * FROM information_schema.plugins WHERE plugin_name = "mroonga" • SELECT * FROM table_name • SELECT * FROM table_name FORCE INDEX(index_name) WHERE ... • SELECT * FROM table_name ORDER BY column_name ASC LIMIT ... • SELECT * FROM table_name ORDER BY column_name ASC • SELECT * FROM table_name ORDER BY column_name DESC LIMIT ... • SELECT * FROM table_name ORDER BY column_name DESC • SELECT * FROM table_name ORDER BY column_name • SELECT * FROM table_name WHERE MATCH(column_name) AGAINST("*D+ word ..." IN BOOLEAN MODE) • SELECT * FROM table_name WHERE MATCH(column_name) AGAINST("*D- word ..." IN BOOLEAN MODE) • SELECT * FROM table_name WHERE MATCH(column_name) AGAINST("*DOR word ..." IN BOOLEAN MODE) • SELECT * FROM table_name WHERE MATCH(column_name) AGAINST("*W word ..." IN BOOLEAN MODE) • SELECT * FROM table_name WHERE MATCH(column_name) AGAINST("+word ..." IN BOOLEAN MODE) • SELECT * FROM table_name WHERE MATCH(column_name) AGAINST("..." IN BOOLEAN MODE) • SELECT * FROM table_name WHERE MATCH(column_name, ...) AGAINST("..." IN BOOLEAN MODE) • SELECT * FROM table_name WHERE MBRContains(GeomFrom- Text('LineString(...)'), geometry_column_name) • SELECT * FROM table_name WHERE NOT MATCH(column_name) AGAINST("..." IN BOOLEAN MODE) • SELECT * FROM table_name WHERE column_name < ... • SELECT * FROM table_name WHERE column_name <= ... • SELECT * FROM table_name WHERE column_name = ... • SELECT * FROM table_name WHERE column_name > ... • SELECT * FROM table_name WHERE column_name >= ... • SELECT * FROM table_name WHERE column_name BETWEEN ... AND ... • SELECT * FROM table_name WHERE column_name IN (SELECT sub_column_name FROM sub_table_name WHERE MATCH(sub_text_column_name) AGAINST("...")) • SELECT *, MATCH(column_name) AGAINST("..." IN BOOLEAN MODE) FROM ta- ble_name WHERE MATCH(column_name) AGAINST("..." IN BOOLEAN MODE) • SELECT FOUND_ROWS() • SELECT SQL_CALC_FOUND_ROWS * FROM table_name WHERE MATCH(...) AGAINST("..." IN BOOLEAN MODE) ORDER BY column_name LIMIT start,n_records • SELECT last_insert_grn_id() • SET binlog_format="MIXED" • SET binlog_format="ROW" • SET binlog_format="STATEMENT" • TRUNCATE TABLE table_name • UPDATE table_name SET column_name = ... List of unavailable SQL commands • CREATE TABLE (...) CHARSET not_listed_charset_above • INSERT INTO (geometry_column_name) VALUES (GeomFrom- Text('LineString(...)')) • INSERT INTO (...) VALUES (null) • START TRANSACTION Basically, The character set which Groonga supports (EUC-JP/UTF-8/SJIS/LATIN1/KOI8R) is also supported in Mroonga. I found an unlisted SQL command If you find unlisted SQL command and confirm whether the SQL command works well or not, we want to list it. Please mail it to us or send a patch against the source file by pull request system on GitHub. Server variables Here are the explanations of server variables that are introduced by Mroonga. mroonga_action_on_fulltext_query_error The default behavior of fulltext query error. The default value of mroonga_action_on_fulltext_query_error is ER- ROR_AND_LOG. This is the conventional behavior which is equal to old version of mroonga. Here is the list of mroonga_action_on_fulltext_query_error which you can use. +----------------+----------------------------+ | Value | Description | +----------------+----------------------------+ | ERROR | Report an error. Logging | | | is disabled. | +----------------+----------------------------+ | ERROR_AND_LOG | Report an error. Logging | | | is enabled. (This is the | | | default) | +----------------+----------------------------+ | IGNORE | Just ignore an error. Log- | | | ging is disabled. | +----------------+----------------------------+ | IGNORE_AND_LOG | Ignore an error, but log- | | | ging is enabled. (Similar | | | to InnoDB behavior) | +----------------+----------------------------+ Here is an example SQL to confirm the value of mroonga_action_on_full- text_query_error: mysql> SHOW VARIABLES LIKE 'mroonga_action_on_fulltext_query_error'; +----------------------------------------+---------------+ | Variable_name | Value | +----------------------------------------+---------------+ | mroonga_action_on_fulltext_query_error | ERROR_AND_LOG | +----------------------------------------+---------------+ 1 row in set (0.00 sec) mroonga_boolean_mode_syntax_flags The flags to custom syntax in MATCH () AGAINST ('...' IN BOOLEAN MODE). This variable is system and session variable. Here are available flags: +-------------------+----------------------------+ | Flag | Description | +-------------------+----------------------------+ | DEFAULT | Equals to SYNTAX_QUERY,AL- | | | LOW_LEADING_NOT. | +-------------------+----------------------------+ | SYNTAX_QUERY | Uses query syntax in | | | Groonga. Query syntax in | | | Groonga is a compatible | | | syntax with MySQL's | | | BOOLEAM MODE syntax. | | | | | | If neither SYNTAX_QUERY | | | nor SYNTAX_SCRIPT aren't | | | specified, SYNTAX_QUERY is | | | used. | +-------------------+----------------------------+ | SYNTAX_SCRIPT | Uses script syntax in | | | Groonga. | | | | | | It's JavaScript like syn- | | | tax. You can use full | | | Groonga features with this | | | syntax. | | | | | | If both SYNTAX_QUERY and | | | SYNTAX_SCRIPT are speci- | | | fied, SYNTAX_SCRIPT is | | | used. | +-------------------+----------------------------+ | ALLOW_COLUMN | Allows COLUMN:... syntax | | | in query syntax. It's not | | | compatible with MySQL's | | | BOOLEAM MODE syntax. | | | | | | You can use multiple in- | | | dexes in one MATCH () | | | AGAINST () with this syn- | | | tax. MySQL can use only | | | one index in a query. You | | | can avoid the restriction | | | by this syntax. | | | | | | You can use not only | | | full-text search operation | | | but also other more opera- | | | tions such as equal opera- | | | tion and prefix search op- | | | eration with this syntax. | | | | | | See query syntax in | | | Groonga for details. | +-------------------+----------------------------+ | ALLOW_UPDATE | Allows updating value by | | | COLUMN:=NEW_VALUE syntax | | | in query syntax. | +-------------------+----------------------------+ | ALLOW_LEADING_NOT | Allows -NOT_INCLUDED_KEY- | | | WORD ... syntax in query | | | syntax. | +-------------------+----------------------------+ The default flags is DEFAULT. It is MySQL's BOOLEAN MODE compatible syntax. You can combine flags by separated by comma such as SYNTAX_QUERY,AL- LOW_LEADING_NOT. Here is an example SQL to use script syntax in Groonga: mysql> SET mroonga_boolean_mode_syntax_flags = "SYNTAX_SCRIPT"; mroonga_database_path_prefix TODO: mroonga_default_parser Deprecated since version 5.04: Use mroonga_default_tokenizer instead. The default parser of the full text search. The default value can be specified by --with-default-parser=PARSER configure argument, whose de- fault value is TokenBigram. Here is an example to use TokenBigramSplitSymbolAlphaDigit as a full- text search parser. It is used by body_index fulltext index. SET GLOBAL mroonga_default_parser=TokenBigramSplitSymbolAlphaDigit; CREATE TABLE diaries ( id INT PRIMARY KEY AUTO_INCREMENT, body TEXT, FULLTEXT INDEX body_index (body) ) DEFAULT CHARSET UTF8; mroonga_default_tokenizer Added in version 5.04. The default tokenizer of the full text search. The default value can be specified by --with-default-tokenizer=TOKENIZER configure argument, whose default value is TokenBigram. Here is an example to use TokenBigramSplitSymbolAlphaDigit as a full- text index tokenizer. It is used by body_index fulltext index. SET GLOBAL mroonga_default_tokenizer=TokenBigramSplitSymbolAlphaDigit; CREATE TABLE diaries ( id INT PRIMARY KEY AUTO_INCREMENT, body TEXT, FULLTEXT INDEX body_index (body) ) DEFAULT CHARSET UTF8; mroonga_default_wrapper_engine TODO: mroonga_dry_write Whether really write data to Groonga database or not. The default value is OFF that means data are really written to Groonga database. Usually we don't need to change the value of this variable. This variable is useful for benchmark because we can measure processing time MySQL and Mroonga. It doesn't include Groonga's processing time. Here is an example SQL to disable writing data to Groonga database: mysql> SHOW VARIABLES LIKE 'mroonga_dry_write'; +-------------------+-------+ | Variable_name | Value | +-------------------+-------+ | mroonga_dry_write | OFF | +-------------------+-------+ 1 row in set (0.00 sec) mysql> SET mroonga_dry_write = true; Query OK, 0 rows affected (0.00 sec) mysql> SHOW VARIABLES LIKE 'mroonga_dry_write'; +-------------------+-------+ | Variable_name | Value | +-------------------+-------+ | mroonga_dry_write | ON | +-------------------+-------+ 1 row in set (0.00 sec) mroonga_enable_optimization Whether enable optimization or not. The default value is ON that means optimization is enabled. Usually we don't need to change the value of this variable. This variable is useful for benchmark. Here is an example SQL to disable optimization: mysql> SHOW VARIABLES LIKE 'mroonga_enable_optimization'; +-----------------------------+-------+ | Variable_name | Value | +-----------------------------+-------+ | mroonga_enable_optimization | ON | +-----------------------------+-------+ 1 row in set (0.00 sec) mysql> SET mroonga_enable_optimization = false; Query OK, 0 rows affected (0.00 sec) mysql> SHOW VARIABLES LIKE 'mroonga_enable_optimization'; +-----------------------------+-------+ | Variable_name | Value | +-----------------------------+-------+ | mroonga_enable_optimization | OFF | +-----------------------------+-------+ 1 row in set (0.00 sec) mroonga_libgroonga_support_lz4 The status of libgroonga supports LZ4. Here is an example SQL to confirm the status of libgroonga supports LZ4: mysql> SHOW GLOBAL VARIABLES LIKE 'mroonga_libgroonga_support_lz4'; +--------------------------------+-------+ | Variable_name | Value | +--------------------------------+-------+ | mroonga_libgroonga_support_lz4 | ON | +--------------------------------+-------+ mroonga_libgroonga_support_zlib The status of libgroonga supports zlib. Here is an example SQL to confirm the status of libgroonga supports zlib: mysql> SHOW GLOBAL VARIABLES LIKE 'mroonga_libgroonga_support_zlib'; +---------------------------------+-------+ | Variable_name | Value | +---------------------------------+-------+ | mroonga_libgroonga_support_zlib | ON | +---------------------------------+-------+ mroonga_libgroonga_support_zstd The status of libgroonga supports Zstandard. Here is an example SQL to confirm the status of libgroonga supports Zs- tandard: mysql> SHOW GLOBAL VARIABLES LIKE 'mroonga_libgroonga_support_zstd'; +---------------------------------+-------+ | Variable_name | Value | +---------------------------------+-------+ | mroonga_libgroonga_support_zstd | ON | +---------------------------------+-------+ mroonga_libgroonga_support_mecab The status of libgroonga supports MeCab. Here is an example SQL to confirm the status of libgroonga supports MeCab. mysql> SHOW GLOBAL VARIABLES LIKE 'mroonga_libgroonga_support_mecab'; +----------------------------------+-------+ | Variable_name | Value | +----------------------------------+-------+ | mroonga_libgroonga_support_mecab | ON | +----------------------------------+-------+ mroonga_libgroonga_version The version string of the groonga library. Here is an example SQL to confirm the using groonga library version: mysql> SHOW VARIABLES LIKE 'mroonga_libgroonga_version'; +----------------------------+------------------+ | Variable_name | Value | +----------------------------+------------------+ | mroonga_libgroonga_version | 1.2.8-9-gbf05b82 | +----------------------------+------------------+ 1 row in set (0.00 sec) mroonga_lock_timeout TODO: mroonga_log_file The path of the log file of Mroonga. The default value is groonga.log. Here is an example transcript to change log file to /tmp/mroonga.log: mysql> SHOW VARIABLES LIKE 'mroonga_log_file'; +------------------+-------------+ | Variable_name | Value | +------------------+-------------+ | mroonga_log_file | groonga.log | +------------------+-------------+ 1 row in set (0.00 sec) mysql> SET GLOBAL mroonga_log_file = "/tmp/mroonga.log"; Query OK, 0 rows affected (0.00 sec) mysql> SHOW VARIABLES LIKE 'mroonga_log_file'; +------------------+------------------+ | Variable_name | Value | +------------------+------------------+ | mroonga_log_file | /tmp/mroonga.log | +------------------+------------------+ 1 row in set (0.00 sec) mroonga_log_level The output level of Mroonga log file. The default value is NOTICE. Here is the list of mroonga_log_level which you can use. +-----------+----------------------------+ | Log level | Description | +-----------+----------------------------+ | NONE | No logging output. | +-----------+----------------------------+ | EMERG | Logging emergency messages | | | such as database corrup- | | | tion. | +-----------+----------------------------+ | ALERT | Logging alert messages | | | such as internal error. | +-----------+----------------------------+ | CRIT | Logging critical message | | | such as deadlock. | +-----------+----------------------------+ | ERROR | Logging error messages | | | such as API error which | | | Mroonga use. | +-----------+----------------------------+ | WARNING | Logging warning messages | | | such as invalid argument. | +-----------+----------------------------+ | NOTICE | Logging notice messages | | | such as configuration or | | | status changed. | +-----------+----------------------------+ | INFO | Logging informative mes- | | | sages such as file system | | | operation. | +-----------+----------------------------+ | DEBUG | Logging debug messages. | | | | | | Recommend to use for | | | Mroonga developer or bug | | | reporter. | +-----------+----------------------------+ | DUMP | Logging dump messages. | +-----------+----------------------------+ Here is an example transcript to change log level to DEBUG that logs many messages for debugging: mysql> SHOW VARIABLES LIKE 'mroonga_log_level'; +-------------------+--------+ | Variable_name | Value | +-------------------+--------+ | mroonga_log_level | NOTICE | +-------------------+--------+ 1 row in set (0.00 sec) mysql> SET GLOBAL mroonga_log_level = "debug"; Query OK, 0 rows affected (0.00 sec) mysql> SHOW VARIABLES LIKE 'mroonga_log_level'; +-------------------+-------+ | Variable_name | Value | +-------------------+-------+ | mroonga_log_level | DEBUG | +-------------------+-------+ 1 row in set (0.00 sec) mroonga_query_log_file The path of the query log file of Mroonga. The default value is empty. If this value is empty, the query log is not stored to file. If this value is not empty, query log is stored to the specified file. Here is an example transcript to change query log file to /tmp/mroonga_query.log: mysql> SHOW VARIABLES LIKE 'mroonga_query_log_file'; +------------------------+-------+ | Variable_name | Value | +------------------------+-------+ | mroonga_query_log_file | | +------------------------+-------+ 1 row in set (0.00 sec) mysql> SET GLOBAL mroonga_query_log_file = "/tmp/mroonga.log"; Query OK, 0 rows affected (0.00 sec) mysql> SHOW VARIABLES LIKE 'mroonga_query_log_file'; +------------------------+------------------+ | Variable_name | Value | +------------------------+------------------+ | mroonga_query_log_file | /tmp/mroonga.log | +------------------------+------------------+ 1 row in set (0.00 sec) mroonga_match_escalation_threshold The threshold to determin whether match method is escalated. See search specification for Groonga about match method escalation. The default value is the same as Groonga's default value. It's 0 for the default installation. The dafault value can be configured in my.cnf or by SET GLOBAL mroonga_match_escalation_threshold = THRESHOLD;. Be- cause this variable's scope is both global and session. Here is an example to use -1 as a threshold to determin whether match method is escalated. -1 means that never escalated. SET GLOBAL mroonga_match_escalation_threshold = -1; Here is an another example to show behavior change by the variable value. CREATE TABLE diaries ( id INT PRIMARY KEY AUTO_INCREMENT, title TEXT, tags TEXT, FULLTEXT INDEX tags_index (tags) COMMENT 'tokenizer "TokenDelimit"' ) ENGINE=mroonga DEFAULT CHARSET=UTF8; -- Test data INSERT INTO diaries (title, tags) VALUES ("Hello groonga!", "groonga install"); INSERT INTO diaries (title, tags) VALUES ("Hello mroonga!", "mroonga install"); -- Matches all records that have "install" tag. SELECT * FROM diaries WHERE MATCH (tags) AGAINST ("install" IN BOOLEAN MODE); -- id title tags -- 1 Hello groonga! groonga install -- 2 Hello mroonga! mroonga install -- Matches no records by "gr" tag search because no "gr" tag is used. -- But matches a record that has "groonga" tag because search -- method is escalated and prefix search with "gr" is used. -- The default threshold is 0. It means that no records are matched then -- search method is escalated. SELECT * FROM diaries WHERE MATCH (tags) AGAINST ("gr" IN BOOLEAN MODE); -- id title tags -- 1 Hello groonga! groonga install -- Disables escalation. SET mroonga_match_escalation_threshold = -1; -- No records are matched. SELECT * FROM diaries WHERE MATCH (tags) AGAINST ("gr" IN BOOLEAN MODE); -- id title tags -- Enables escalation again. SET mroonga_match_escalation_threshold = 0; -- Matches a record by prefix search with "gr". SELECT * FROM diaries WHERE MATCH (tags) AGAINST ("gr" IN BOOLEAN MODE); -- id title tags -- 1 Hello groonga! groonga install mroonga_max_n_records_for_estimate Added in version 5.02. TODO: mroonga_enable_operations_recording Whether recording operations for recover is enabled or not. The de- fault value is OFF that means operations are not recorded to Groonga database. It needs to reopen the database with FLUSH TABLES in order to reflect the variable is changed. Here is an example SQL to disable operations recording: mysql> SET GLOBAL mroonga_enable_operations_recording = false; Query OK, 0 rows affected (0.00 sec) mysql> FLUSH TABLES; Query OK, 0 rows affected (0.00 sec) mysql> SHOW GLOBAL VARIABLES LIKE 'mroonga_enable_operations_recording'; +-------------------------------------+-------+ | Variable_name | Value | +-------------------------------------+-------+ | mroonga_enable_operations_recording | OFF | +-------------------------------------+-------+ mroonga_vector_column_delimiter The delimiter when outputting a vector column. The default value is a white space. Here is an example SQL to change the delimiter to a semicolon from a white space: mysql> SHOW VARIABLES LIKE 'mroonga_vector_column_delimiter'; +---------------------------------+-------+ | Variable_name | Value | +---------------------------------+-------+ | mroonga_vector_column_delimiter | | +---------------------------------+-------+ 1 row in set (0.00 sec) mysql> SET GLOBAL mroonga_vector_column_delimiter = ';'; Query OK, 0 rows affected (0.00 sec) mysql> SHOW GLOBAL VARIABLES LIKE 'mroonga_vector_column_delimiter'; +---------------------------------+-------+ | Variable_name | Value | +---------------------------------+-------+ | mroonga_vector_column_delimiter | ; | +---------------------------------+-------+ mroonga_version The version string of Mroonga. Here is an example SQL to confirm the running mroonga version: mysql> SHOW VARIABLES LIKE 'mroonga_version'; +-----------------+-------+ | Variable_name | Value | +-----------------+-------+ | mroonga_version | 1.10 | +-----------------+-------+ 1 row in set (0.00 sec) mroonga_condition_push_down_type Added in version 7.10. It controls how to enable condition push down support. The default value is ONE_FULL_TEXT_SEARCH. It means that condition push down is enabled only when WHERE clause has one MATCH AGAINST condition. Here are available values: +----------------------+----------------------------+ | Value | Description | +----------------------+----------------------------+ | NONE | Never use condition push | | | down. | +----------------------+----------------------------+ | ALL | Always use condition push | | | down. It's experimental | | | for now. | +----------------------+----------------------------+ | ONE_FULL_TEXT_SEARCH | Use condition push down | | | only when one MATCH | | | AGAINST condition exists | | | in WHERE. | | | | | | It's the default. | +----------------------+----------------------------+ Here is an example SQL to confirm the current value: mysql> SHOW VARIABLES LIKE 'mroonga_condition_push_down_type'; +----------------------------------+----------------------+ | Variable_name | Value | +----------------------------------+----------------------+ | mroonga_condition_push_down_type | ONE_FULL_TEXT_SEARCH | +----------------------------------+----------------------+ 1 row in set (0.00 sec) Status variables Here are the explanations of status variables that are introduced by Mroonga. Mroonga_count_skip This value is increased when 'fast line count feature' is used. You can use this value to check if the feature is working when you enable it. Here is an example how to check it: mysql> SHOW STATUS LIKE 'Mroonga_count_skip'; +--------------------+-------+ | Variable_name | Value | +--------------------+-------+ | Mroonga_count_skip | 0 | +--------------------+-------+ 1 row in set (0.00 sec) Mroonga_fast_order_limit This value is increased when 'fast ORDER BY LIMIT feature' is used. You can use this value to check if the feature is working when you en- able it. Here is an example how to check it: mysql> SHOW STATUS LIKE 'Mroonga_fast_order_limit'; +--------------------------+-------+ | Variable_name | Value | +--------------------------+-------+ | Mroonga_fast_order_limit | 0 | +--------------------------+-------+ 1 row in set (0.00 sec) Mroonga_condition_push_down This value is increased when condition push down is used. You can use this value to check whether condition push down is used or not. Here is an example how to check it: mysql> SHOW STATUS LIKE 'Mroonga_condition_push_down'; +-----------------------------+-------+ | Variable_name | Value | +-----------------------------+-------+ | Mroonga_condition_push_down | 1 | +-----------------------------+-------+ 1 row in set (0.00 sec) Mroonga_n_pooling_contexts This value shows the number of pooling contexts for mroonga_command(). These contexts are reused from multiple mroonga_command() calls. So, it's not increased continually. Here is an example how to check it: mysql> SHOW STATUS LIKE 'Mroonga_n_pooling_contexts'; +----------------------------+-------+ | Variable_name | Value | +----------------------------+-------+ | Mroonga_n_pooling_contexts | 0 | +----------------------------+-------+ 1 row in set (0.00 sec) Mroonga_memory_map_size Added in version 12.0.4. This value shows the currently memory mapped size in bytes. Here is an example how to check it: mysql> SHOW STATUS LIKE 'Mroonga_memory_map_size'; +-------------------------+----------+ | Variable_name | Value | +-------------------------+----------+ | Mroonga_memory_map_size | 83406848 | +-------------------------+----------+ 1 row in set (0.00 sec) UDF (User Defined Functions) Mroonga provides some useful functionality as UDF (User Defined Func- tions). mroonga_command() Added in version 3.02. Summary mroonga_command() UDF executes the given string as a Groonga command and returns result of the Groonga command. Groonga command will be faster than MySQL query. mroonga_command() is an UDF for advanced users. Normally, you don't need to use this UDF. Syntax mroonga_command() has only one required parameter: mroonga_command(command) command is a string value. It's a Groonga command to be executed. Usage TODO Parameters Required parameters There is one required parameter, command. command It specifies a Groonga command to be executed. Return value It returns an evaluated result of the given Groonga command as a string. See also Command in Groonga document. mroonga_escape() Added in version 3.08. Summary mroonga_escape UDF provides functionality to escape given string. It also accepts parameter what character should be escaped. Syntax mroonga_escape() has required parameter and optional parameter: mroonga_escape(string) mroonga_escape(string, special_characters) Usage Here is the example query which use special characters to be escaped: SELECT * FROM `symbols` WHERE MATCH(`content`) AGAINST(mroonga_escape("+hello_world()", "()") IN BOOLEAN MODE); Here is the example about special characters which is escaped: SELECT mroonga_escape("+-<>~*()\"\\\:"); -- \+\-\<\>\~\*\(\)\"\\\: Parameters Required parameters There is one required parameter, string. string It specifies text which you want to escape. Optional parameters There is one optional parameter, special_characters. special_characters It specifies characters to escape. The default value is +-<>~*()"\\:. Return value It returns escaped string. mroonga_highlight_html() Added in version 7.05. Summary mroonga_highlight_html() highlights the specified keywords in target text. It surrounds each keyword with <span class="keyword">...</span> and special characters in HTML such as < and > are escaped. You can use the result as is safely in HTML. Syntax mroonga_highlight_html() has required parameter and optional parameter: mroonga_highlight_html(text, query AS query) mroonga_highlight_html(text, query AS query, open_tag AS open_tag) mroonga_highlight_html(text, query AS query, close_tag AS close_tag) mroonga_highlight_html(text, query AS query, open_tag AS open_tag, close_tag AS close_tag) mroonga_highlight_html(text, keyword1, ..., keywordN) mroonga_highlight_html(text, keyword1, ..., keywordN, open_tag AS open_tag) mroonga_highlight_html(text, keyword1, ..., keywordN, open_tag AS close_tag) mroonga_highlight_html(text, keyword1, ..., keywordN, open_tag AS open_tag, close_tag AS close_tag) AS query is very important. You must specify it to extract keywords from query. open_tag and close_tag are optional. You can specify a tag for high- lighting. Usage Here is a sample to highlight keywords "mroonga" and "groonga" in tar- get text by query "mroonga OR groonga". You must specify AS query: SELECT mroonga_highlight_html('Mroonga is the Groonga based storage engine.', 'mroonga OR groonga' AS query) AS highlighted; Here is the result of the execution example: +--------------------------------------------------------------------------------------------------------+ | highlighted | +--------------------------------------------------------------------------------------------------------+ | <span class="keyword">Mroonga</span> is the <span class="keyword">Groonga</span> based storage engine. | +--------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec) Here is a sample to highlight keywords "mroonga" and "groonga" in tar- get text by keywords "mroonga" and "groonga": SELECT mroonga_highlight_html('Mroonga is the Groonga based storage engine.', 'mroonga', 'groonga') AS highlighted; Here is the result of the execution example: +--------------------------------------------------------------------------------------------------------+ | highlighted | +--------------------------------------------------------------------------------------------------------+ | <span class="keyword">Mroonga</span> is the <span class="keyword">Groonga</span> based storage engine. | +--------------------------------------------------------------------------------------------------------+ Parameters Required parameters There is one required parameter. text The column name of string or string value to be highlighted. Optional parameters There are some optional parameters. query Specify query in Groonga's query syntax. You must specify AS query to extract keywords from query like the fol- lowing: SELECT mroonga_highlight_html('...', 'mroonga OR groonga' AS query); keyword Specify 0 or more keywords to be highlighted. open_tag Specify an open tag for highlighting. You must specify AS open_tag to specify this parameter like the follow- ing. SELECT mroonga_highlight_html('Mroonga is the Groonga based storage engine.', 'groonga', '<span class="my-class">' AS open_tag) AS highlighted; -- +----------------------------------------------------------------------------+ -- | highlighted | -- +----------------------------------------------------------------------------+ -- | Mroonga is the <span class="my-class">Groonga</span> based storage engine. | -- +----------------------------------------------------------------------------+ The default value is <span class="keyword">. close_tag Specify a close tag for highlighting. You must specify AS close_tag to specify this parameter like the fol- lowing. SELECT mroonga_highlight_html('Mroonga is the Groonga based storage engine.', 'groonga', '<mark>' AS open_tag, '</mark>' AS close_tag) AS highlighted; -- +-----------------------------------------------------------+ -- | highlighted | -- +-----------------------------------------------------------+ -- | Mroonga is the <mark>Groonga</mark> based storage engine. | -- +-----------------------------------------------------------+ The default value is </span>. Return value It returns highlighted HTML. If optional parameter is not given, it only escapes special characters in HTML such as <, > in text. mroonga_last_insert_grn_id() Summary mroonga_last_insert_grn_id() UDF provides functionality to get id of last inserted record. This id is unique in Groonga. Syntax Usage Parameters Required parameters Return value It returns id of last inserted record. mroonga_normalize() Added in version 5.11. Summary mroonga_normalize() UDF normalizes text by Groonga's normalizer. Syntax mroonga_normalize() has required parameter and optional parameter: mroonga_normalize(string) mroonga_normalize(string, normalizer_name) Usage Here is the example query which use Groonga's NormalizerAuto normalizer to be normalized: SELECT mroonga_normalize("ABCD"); abcd Here is the example query which use Groonga's NormalizerMySQLUnicode- CIExceptKanaCIKanaWithVoicedSoundMark normalizer to be normalized: SELECT mroonga_normalize("aB", "NormalizerMySQLUnicodeCIExceptKanaCIKanaWithVoicedSoundMark"); ABCD Parameters Required parameters There is one required parameter, string. string It specifies text which you want to normalize. Optional parameters There is one optional parameter, normalizer_name. normalizer_name It specifies Groonga's normalizer name to normalize. The default value is NormalizerAuto. Return value It returns normalized string. mroonga_snippet() Added in version 2.07. Summary mroonga_snippet() UDF provides functionality to get keyword in context. Syntax mroonga_snippet() has required parameter and optional parameter: mroonga_snippet(document, max_length, max_count, encoding, skip_leading_spaces, html_escape, snippet_prefix, snippet_suffix, word1, word1_prefix, word1_suffix, ..., [wordN, wordN_prefix, wordN_suffix]) Usage There is a case that you want to extract keyword and surrounding text as a search results. Snippet means 'keyword and surrounding text'. It is called 'Keyword in context'. mroonga_snippet() function provides the way to get snippet from search results. You can use mroonga_snippet() UDF in storage mode and wrapper mode. mroonga_snippet() function is included in Mroonga as a User-Defined Function (UDF), but if you have not yet register it in MySQL by CREATE FUNCTION, you need to invoke the following SQL for defining a function. mysql> CREATE FUNCTION mroonga_snippet RETURNS STRING SONAME 'ha_mroonga.so'; mroonga_snippet() function is useful for searching the text which con- tains keyword and associated one by using MATCH .. AGAINST syntax. Imagine searching the document which contains 'fulltext' as a keyword. Assume that some keyword such as 'MySQL' and 'search' are associated with 'fulltext'. mroonga_snippet() function meets above. Here is the schema definition for execution examples(storage mode): CREATE TABLE `snippet_test` ( `id` int(11) NOT NULL, `text` text, PRIMARY KEY (`id`), FULLTEXT KEY `text` (`text`) ) ENGINE=mroonga DEFAULT CHARSET=utf8 Here is the schema definition for execution examples(wrapper mode): CREATE TABLE `snippet_test` ( `id` int(11) NOT NULL, `text` text, PRIMARY KEY (`id`), FULLTEXT KEY `text` (`text`) ) ENGINE=mroonga COMMENT 'engine = "innodb"' DEFAULT CHARSET=utf8 Here is the sample data for execution examples: INSERt INTO snippet_test (id, text) VALUES (1, 'An open-source fulltext search engine and column store.'); INSERT INTO snippet_test (id, text) VALUES (2, 'An open-source storage engine for fast fulltext search with MySQL.'); INSERT INTO snippet_test (id, text) VALUES (3, 'Tritonn is a patched version of MySQL that supports better fulltext search function with Senna.'); Here is the results of execution examples: mysql> SELECT * FROM snippet_test; +----+-------------------------------------------------------------------------------------------------+ | id | text | +----+-------------------------------------------------------------------------------------------------+ | 1 | An open-source fulltext search engine and column store. | | 2 | An open-source storage engine for fast fulltext search with MySQL. | | 3 | Tritonn is a patched version of MySQL that supports better fulltext search function with Senna. | +----+-------------------------------------------------------------------------------------------------+ 3 rows in set (0.00 sec) mysql> SELECT id, text, mroonga_snippet(text, 8, 2, 'ascii_general_ci', 1, 1, '...', '...<br>', 'fulltext', '<span class="w1">', '</span>', 'MySQL', '<span class="w2">', '</span>', 'search', '<span calss="w3">', '</span>') FROM snippet_test WHERE MATCH(text) AGAINST ('+fulltext' IN BOOLEAN MODE); +----+-------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | id | text | mroonga_snippet(text, 8, 2, 'ascii_general_ci', 1, 1, '...', '...<br>', 'fulltext', '<span class="w1">', '</span>', 'MySQL', '<span class="w2">', '</span>', 'search', '<span calss="w3">', '</span>') | +----+-------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 1 | An open-source fulltext search engine and column store. | ...<span class="w1">fulltext</span>...<br>... <span calss="w3">search</span> ...<br> | | 2 | An open-source storage engine for fast fulltext search with MySQL. | ...<span class="w1">fulltext</span>...<br>... <span calss="w3">search</span> ...<br> | | 3 | Tritonn is a patched version of MySQL that supports better fulltext search function with Senna. | ...f <span class="w2">MySQL</span> ...<br>...<span class="w1">fulltext</span>...<br> | +----+-------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 3 rows in set (0.00 sec) The keyword 'fulltext' and associated keyword 'MySQL' and 'search' has been extracted. Parameters Required parameters There are many required parameters. document The column name or string value is required. max_length The max length of snippet (bytes) is required. max_count The max elements of snippets (N word) is required. encoding The encoding of document is required. You can specify the value of en- coding such as 'ascii_general_ci', 'cp932_japanese_ci', 'eucjpms_japan- ese_ci' and so on. skip_leading_spaces Specify whether skip leading spaces or not. Specify the value 1 for skipping leading spaces, 0 for not. html_escape HTML escape is enabled or not. Specify the value 1 for enabling HTML escape, 0 for not. prefix The start text of snippet. suffix The end text of snippet. Optional parameters There is one optional group parameter, wordN related prefix and suffix. wordN Specify any word. wordN_prefix It is the start text of wordN. wordN_suffix It is the end text of wordN. Return value It returns snippet string. mroonga_snippet_html() Added in version 5.09. Summary mroonga_snippet_html() UDF provides functionality to get highlighted keyword in context. Syntax mroonga_snippet_html() has required parameter and optional parameter: mroonga_snippet_html(document, key_word) Usage Parameters Required parameters There are many required parameters. Optional parameters Return value It returns snippet string. Limitations There are some limitations in Mroonga storage engine. Limitations of table A table has the following limitations. This limitation is derived from Groonga. • The maximum one key size: 4KiB • The maximum total size of keys: 4GiB • The maximum number of records: • No primary key table: 1,073,741,815 (2 30 - 9) • PRIMARY KEY or PRIMARY KEY USING BTREE table: 1,073,741,823 (2 30 - 1) • PRIMARY KEY USING HASH table: 536,870,912 (2 29) Keep in mind that these limitations may vary depending on conditions. Limitations of indexing A full-text index has the following limitations. This limitation is derived from groonga and applied to each table. • The maximum number of distinct terms: 268,435,455 (more than 268 mil- lion) • The maximum index size: 256GiB Keep in mind that these limitations may vary depending on conditions. Confirm by SHOW TABLE STATUS or SHOW INDEX FROM ... whether your table data matches to this limitations. Limitations about the value of columns There is a limitation about the value of column in storage mode. Mroonga storage engine executes automatic conversion against the value NULL. For example, if the value NULL is used in DATE or DATETIME columns, Mroonga storage engine automatically converts 0 into 1 as the value of month or date. Thus, the value 0 is treated as the 1st month (January) of the year or the 1st date of the month. And more, the value NULL is treated as the value of UNIX time 0 (1970-01-01 00:00:00). This kind of automatic conversion is not restricted to only DATE or DATETIME types. The value NULL is converted into the default value of columns. In most cases, it will be converted into empty string for column which belongs to type of string, 0 for column which belongs to type of numeric. Here is an example to show behavior described above. CREATE TABLE date_limitation ( id INT PRIMARY KEY AUTO_INCREMENT, input varchar(32) DEFAULT NULL, date DATE DEFAULT NULL ) ENGINE=mroonga DEFAULT CHARSET=UTF8; CREATE TABLE datetime_limitation ( id INT PRIMARY KEY AUTO_INCREMENT, input varchar(32) DEFAULT NULL, datetime DATETIME DEFAULT NULL ) ENGINE=mroonga DEFAULT CHARSET=UTF8; -- Test data for date_limitation INSERT INTO date_limitation (input) VALUES ("NULL"); INSERT INTO date_limitation (input, date) VALUES ("1970-00-00", "1970-00-00"); -- Test data for datetime_limitation INSERT INTO datetime_limitation (input) VALUES ("NULL"); INSERT INTO datetime_limitation (input, datetime) VALUES ("1970-00-00 00:00:00", "1970-00-00 00:00:00"); Here is the results of execution example: mysql> select * from date_limitation; +----+------------+------------+ | id | input | date | +----+------------+------------+ | 1 | NULL | 1970-01-01 | | 2 | 1970-00-00 | 1970-01-01 | +----+------------+------------+ 2 rows in set (0.00 sec) mysql> select * from datetime_limitation; +----+---------------------+---------------------+ | id | input | datetime | +----+---------------------+---------------------+ | 1 | NULL | 1970-01-01 00:00:00 | | 2 | 1970-00-00 00:00:00 | 1970-01-01 00:00:00 | +----+---------------------+---------------------+ 2 rows in set (0.00 sec) Limitations of column size A column has the following limitation. • The maximum stored data size of a column: 256GiB Full text search Boolean mode Summary Mroonga can perform boolean full text searches using the IN BOOLEAN MODE modifier for MATCH AGAINST: SELECT ... WHERE MATCH(column) AGAINST ('...' IN BOOLEAN MODE); Normally, IN BOOLEAN MODE is suitable rather than the default IN NAT- URAL LANGUAGE MODE. Because IN BOOLEAN MODE is similar to query in Web search engine. Most people familiar with query in Web search engine. You can use qualifiers which MySQL support and Mroonga original pragmas in boolean full text search query. These qualifiers and pragmas can change the relative rank of search re- sults. In the case of a search string not using neither a qualifier nor a pragma, the search results that contain the search string will be rated higher. Usage Here are schema and data to show examples: CREATE TABLE books ( `id` INTEGER AUTO_INCREMENT, `title` text, PRIMARY KEY(`id`), FULLTEXT INDEX title_index (title) ) ENGINE=Mroonga DEFAULT CHARSET=utf8mb4; INSERT INTO books (title) VALUES ('Professional MySQL'); INSERT INTO books (title) VALUES ('MySQL for Professional'); INSERT INTO books (title) VALUES ('Mroonga = MySQL + Groonga'); Qualifier Here are supported qualifiers. KEYWORD1 KEYWORD2 No operator between keywords such as KEYWORD1 KEYWORD2 indicates that one of keywords must be present in each row that is returned. Mroonga for query means that Mroonga or for must be present: SELECT title FROM books WHERE MATCH(title) AGAINST('Mroonga for' IN BOOLEAN MODE); -- +---------------------------+ -- | title | -- +---------------------------+ -- | Mroonga = MySQL + Groonga | -- | MySQL for Professional | -- +---------------------------+ KEYWORD1 OR KEYWORD2 OR (must be uppercase) indicates that left hand side keyword or right hand side keyword must be present in each row that is returned. Mroonga OR for query means that Mroonga or for must be present: SELECT title FROM books WHERE MATCH(title) AGAINST('Mroonga OR for' IN BOOLEAN MODE); -- +---------------------------+ -- | title | -- +---------------------------+ -- | Mroonga = MySQL + Groonga | -- | MySQL for Professional | -- +---------------------------+ OR is the default operator. You can omit it. Both Mroonga OR for and Mroonga for return the same result. +KEYWORD A leading plus sign indicates that this word must be present in each row that is returned. +MySQL +Mroonga query means that both MySQL and Mroonga must be present: SELECT title FROM books WHERE MATCH(title) AGAINST('+MySQL +Groonga' IN BOOLEAN MODE); -- +---------------------------+ -- | title | -- +---------------------------+ -- | Mroonga = MySQL + Groonga | -- +---------------------------+ -KEYWORD A leading minus sign indicates that this word must not be present in any of the rows that are returned. +MySQL -Mroonga query means that MySQL must be present but Mroonga must not be present: SELECT title FROM books WHERE MATCH(title) AGAINST('+MySQL -Mroonga' IN BOOLEAN MODE); -- +------------------------+ -- | title | -- +------------------------+ -- | Professional MySQL | -- | MySQL for Professional | -- +------------------------+ PREFIX* A following asterisk indicates that all words starting with this word must be present in any of the rows that are returned. +M* query means that words starting M (MySQL and Mroonga in this case) must be present: SELECT title FROM books WHERE MATCH(title) AGAINST('+M*' IN BOOLEAN MODE); -- +---------------------------+ -- | title | -- +---------------------------+ -- | Mroonga = MySQL + Groonga | -- | Professional MySQL | -- | MySQL for Professional | -- +---------------------------+ NOTE: To be precise, "word" may not be "word" you think. "word" in this context is "token". "token" may not be word. For example, tokens in "It's" are "It", "'" and "s". You can confirm token by mroonga_command() and tokenize: SELECT mroonga_command('tokenize TokenBigram "It''s" NormalizerMySQLGeneralCI'); -- +--------------------------------------------------------------------------+ -- | mroonga_command('tokenize TokenBigram "It''s" NormalizerMySQLGeneralCI') | -- +--------------------------------------------------------------------------+ -- | [ | -- | { | -- | "value":"IT", | -- | "position":0, | -- | "force_prefix":false | -- | }, | -- | { | -- | "value":"'", | -- | "position":1, | -- | "force_prefix":false | -- | }, | -- | { | -- | "value":"S", | -- | "position":2, | -- | "force_prefix":false | -- | } | -- | ] | -- +--------------------------------------------------------------------------+ JSON value in the above result is formatted by hand. "PHRASE" Quoting phrase by double quote (") indicates that the phrase must be present in any of the rows that are returned. +"Professional MySQL" query means that Professional MySQL phrase must be present. The query doesn't match to MySQL for Profession. MySQL for Profession includes both MySQL and Professional words but doesn't in- clude Professional MySQL phrase: SELECT title FROM books WHERE MATCH(title) AGAINST('+"Professional MySQL"' IN BOOLEAN MODE); -- +--------------------+ -- | title | -- +--------------------+ -- | Professional MySQL | -- +--------------------+ (SUBEXPRESSION...) Parentheses groups expressions. +(Groonga OR Mroonga) +MySQL query means the following: • Groonga or Mroonga must be present. • MySQL must be present. Here is the result of the query: SELECT title FROM books WHERE MATCH(title) AGAINST('+(Groonga OR Mroonga) +MySQL' IN BOOLEAN MODE); -- +---------------------------+ -- | title | -- +---------------------------+ -- | Mroonga = MySQL + Groonga | -- +---------------------------+ Pragma Pragma is metadata for query. You can change how to parse query by specifying pragma. You can embed pragma at the head of query for specifying how to exe- cute. Pragma must exist in the beginning of a query. Don't put a blank into a head of the query. Pragma starts with *: SELECT MATCH AGAINST('*PRAGMA ...' IN BOOLEAN MODE); You can specify multiple pragmas: SELECT MATCH AGAINST('*PRAGMA1PRAGMA2 ...' IN BOOLEAN MODE); Here are available pragmas. D pragma D pragma indicates the default operator. It's used when an individual operator is omitted. Here is the D pragma syntax. You can choose one of OR, + or - as ${OP- ERATOR}: *D${OPERATOR} DOR DOR means that "or" is used as the default operator. This is the default. Here is an example to use DOR. '*DOR for Mroonga' IN BOOLEAN MODE re- turns records that includes for or Mroonga: SELECT title FROM books WHERE MATCH (title) AGAINST('*DOR for Mroonga' IN BOOLEAN MODE); -- +---------------------------+ -- | title | -- +---------------------------+ -- | MySQL for Professional | -- | Mroonga = MySQL + Groonga | -- +---------------------------+ D+ D+ means that "and" is used as the default operator. It's similar to query in Web search engine. Here is an example to use D+. '*D+ MySQL Mroonga' IN BOOLEAN MODE re- turns records that includes MySQL and Mroonga: SELECT title FROM books WHERE MATCH (title) AGAINST('*D+ MySQL Mroonga' IN BOOLEAN MODE); -- +---------------------------+ -- | title | -- +---------------------------+ -- | Mroonga = MySQL + Groonga | -- +---------------------------+ D- D- means that "not" is used as the default operator. Here is an example to use D-. '*D- MySQL Mroonga' IN BOOLEAN MODE re- turns records that includes MySQL but doesn't include Mroonga: SELECT title FROM books WHERE MATCH (title) AGAINST('*D- MySQL Mroonga' IN BOOLEAN MODE); -- +------------------------+ -- | title | -- +------------------------+ -- | Professional MySQL | -- | MySQL for Professional | -- +---------------------------+ W pragma W pragma indicates target section and its weight for multiple column index. You can specify different weight for each section. The default weight is 1. 1 means that no weight. Here is the W pragma syntax. ${SECTION} is a number that is begun not from 0 but from 1. ${WEIGHT} is omitable: *W[${SECTION1}[:${WEIGHT1}]][,${SECTION2}[:${WEIGHT2}]][,...] Here are schema and data to show examples. You need to create a multi- ple column index to use W pragma: CREATE TABLE memos ( `id` INTEGER AUTO_INCREMENT, `title` text, `content` text, PRIMARY KEY(`id`), FULLTEXT INDEX text_index (title, content) ) ENGINE=Mroonga DEFAULT CHARSET=utf8mb4; INSERT INTO memos (title, content) VALUES ( 'MySQL', 'MySQL is a RDBMS.' ); INSERT INTO memos (title, content) VALUES ( 'Groonga', 'Groonga is a full text search engine.' ); INSERT INTO memos (title, content) VALUES ( 'Mroonga', 'Mroonga is a storage engine for MySQL based on Groonga.' ); Here is an example to show how to use weight. title column has 10 weight and content columns has 1 weight. It means that keyword in title column is 10 times important than keyword in content column: SELECT title, content, MATCH (title, content) AGAINST('*W1:10,2:1 +Groonga' IN BOOLEAN MODE) AS score FROM memos; -- +---------+--------------------------------------------------------+-------+ -- | title | content | score | -- +---------+--------------------------------------------------------+-------+ -- | MySQL | MySQL is a RDBMS. | 0 | -- | Groonga | Groonga is a full text search engine. | 11 | -- | Mroonga | Mroonga is a storage engine for MySQL based on Groonga | 1 | -- +---------+--------------------------------------------------------+-------+ The score of the first record is 0. Because it doesn't have any Groonga in both title column and content column. The score of the second record is 11. Because it has Groonga in both title column and content column. Groonga in title column has score 10. Groonga in content column has score 1. 11 is sum of them. The score of the third record is 1. Because it has Groonga in only con- tent column. `Groonga in content column has score 1. So the score of the record is 1. S pragma S pragma indicates syntax of the query. Here is a syntax of S pragma: *S${SYNTAX} Here is a list of available syntax: • S: Script syntax *SS You can use script syntax by *SS pragma. You can use full Groonga search features in script syntax. Here are schema and data to show example of script syntax usage: CREATE TABLE comments ( `content` text, FULLTEXT INDEX content_index (content) ) ENGINE=Mroonga DEFAULT CHARSET=utf8mb4; INSERT INTO comments VALUES ( 'A student started to use Mroonga storage engine. It is very fast!' ); INSERT INTO comments VALUES ( 'Another student also started to use Mroonga storage engine. It is very fast!' ); Here is an example to use near search by script syntax: SELECT content, MATCH (content) AGAINST('*SS content *N "student fast"' IN BOOLEAN MODE) AS score FROM comments; -- +------------------------------------------------------------------------------+-------+ -- | content | score | -- +------------------------------------------------------------------------------+-------+ -- | A student started to use Mroonga storage engine. It is very fast! | 1 | -- | Another student also started to use Mroonga storage engine. It is very fast! | 0 | -- +------------------------------------------------------------------------------+-------+ Near search matches only when there are 10 or less words between speci- fied words (student and fast in this case). So student started ...(8 words)... very fast is matched but student also started ...(8 words)... very fast isn't matched. You can also use other advanced features. Search and Scoring in Mroonga Mroonga Scoring in natural language mode The search score of Mroonga in natural language mode is a similarity score between query and document. Mroonga's scoring algorithm is as follows: 1. Splitting a query into tokens. 2. Removing not matched tokens. 3. Calculating weight per token. 4. Getting the top N weight tokens. 5. Summing up the weight per token which exists in a document, and is one of the top N weight tokens. The total weight is the simi- larity score for the document and the query. We are showing by example. The first thing we create a table and in- sert data as follows: SET NAMES UTF8; CREATE TABLE diaries ( id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, content TEXT, FULLTEXT INDEX(content) ) ENGINE mroonga DEFAULT CHARSET UTF8; INSERT INTO diaries (content) VALUES("It'll be fine tomorrow as well."); INSERT INTO diaries (content) VALUES("It'll rain tomorrow."); INSERT INTO diaries (content) VALUES("It's fine today. It'll be fine tomorrow as well."); INSERT INTO diaries (content) VALUES("It's fine today. But it'll rain tomorrow."); We use a query "fine today" to search. The search result is as fol- lows: mysql> SELECT *, MATCH (content) AGAINST ("fine today") AS score -> FROM diaries -> WHERE MATCH (content) AGAINST ("fine today") -> ORDER BY MATCH (content) AGAINST ("fine today") DESC; +----+--------------------------------------------------+--------+ | id | content | score | +----+--------------------------------------------------+--------+ | 3 | It's fine today. It'll be fine tomorrow as well. | 131073 | | 4 | It's fine today. But it'll rain tomorrow. | 131073 | +----+--------------------------------------------------+--------+ 2 rows in set (0.01 sec) Now, let us explain how to make the result score "131073". Splitting a query into tokens The default tokenizer splits up the query "fine today" into two tokens as follows: • fine • today Removing not matched tokens We have nothing to do in this case because the tokens exist in docu- ments. • fine: include in document id=1,3,4 • today: include in document id=3,4 Calculating weight per token • fine: 116508(= 1048576 / 9) • today: 131072(= 1048576 / 8) 1048576(= 2 ** 20) denotes the number of tokens in all documents. We should get the number from the table, but as a matter of fact, the fixed value is used for simplicity. In the fraction 1048576 / 8, the denominator "8" indicates that the number of document which contains token "today". In the fraction 1048576 / 9, the denominator "9" means the number of document which contains token "fine". The denominator "8" and "9" are approximate number, strictly speaking, "2" is the correct answer for the number of document that contains "to- day", "3" is the correct answer for the number of document that con- tains "fine". You can check the approximate number of token by using the following query. SELECT mroonga_command("select diaries-content --query '_key:fine OR _key:today' --output_columns _key, index --limit -1") AS groonga_response; The retrieval result of the above query is as follows: [[[2],[["_key","ShortText"],["index","diaries"]],["FINE",9],["TODAY",8]]] Getting the top N weight tokens The formula for the N is "the number of occurrence of token in document / 8 + 1" In this case, N = 2 / 8 + 1 1 The sorted tokens are as follows: • today: 131072(= 1048576 / 8) • fine: 116508(= 1048576 / 9) N = 1, then we get "today: 131072(= 1048576 / 8)". Summing up the weight per token which exists in a document, and is one of the top N weight tokens • today: include in document id=3,4 Finally document id 3,4 are hit, the similarity score between query and document(id=3) is 131072 + 1 = 131073 ("1" is the number of occurrence of token "today" in document). The similarity score between query and document(id=4) is the same as the score between query and document(id=3). FAQ There are some frequently asked questions for Mroonga. This section de- scribes about them. How to avoid "mmap Cannot allocate memory" error? Short answer Increase vm.max_map_count Linux kernel parameter value. Long answer There is a case following mmap error in log file: 2013-06-04 08:19:34.835218|A|4e86e700|mmap(4194304,551,432017408)=Cannot allocate memory <13036498944> Note that <13036498944> means total size of mmap (almost 12GB) in this case. So you need to confirm following point of views. • Are there enough free memory? • Are maximum number of mappings exceeded? To check there are enough free memory, you can use vmstat command. To check whether maximum number of mappings are exceeded, you can in- vestigate the value of vm.max_map_count. If this issue is fixed by modifying the value of vm.max_map_count, it's exactly the reason. As Groonga allocates memory chunks each 256KB, you can estimate the size of database you can handle by following formula: (database size) = vm.max_map_count * (memory chunks) If you want to handle over 16GB Groonga database, you must specify at least 65536 as the value of vm.max_map_count: database size (16GB) = vm.max_map_count (65536) * memory chunks (256KB) You can modify vm.max_map_count temporary by sudo sysctl -w vm.max_map_count=65536. Then save the configuration value to /etc/sysctl.conf or /etc/sysctl.d/*.conf. COMMUNITY There are some places for sharing Groonga and Mroonga information. We welcome you to join our community. Mailing List There are mailing lists for discussion about Groonga and Mroonga. For English speakers groonga-talk@lists.sourceforge.net For Japanese speakers groonga-dev@lists.osdn.me Chat room There are chat rooms for discussion about Groonga and Mroonga. For English speakers groonga/en chat room on Gitter For Japanese speakers groonga/ja chat room on Gitter Twitter @groonga tweets Groonga and Mroonga related information. Please follow the account to get the latest Groonga and Mroonga related information! Facebook Mroonga page on Facebook shares Mroonga related information. Please like the page to get the latest Mroonga related information! HOW TO CONTRIBUTE TO MROONGA We welcome your contributions to the Mroonga Project. There are many ways to contribute, such as using Mroonga, introduction to others, etc. For example, if you find a bug when using Mroonga, you are welcome to report the bug. Coding and documentation are also welcome for Mroonga and its related projects. As a user: If you are interested in Mroonga, please read this document and try it. As a spokesman: Please introduce Mroonga to your friends and colleagues. As a developer: Bug report, development and documentation This section describes the details. How to report a bug There are some ways to report a bug. You can use either way It makes no difference to us. Submit a bug to the issue tracker Mroonga project uses GitHub issue tracker. You can use English or Japanese to report a bug. Report a bug with the other ways Mroonga project has Community for discussing about Mroonga. Please send an Email that describes a bug. How to collect data for reporting a reproducible bug There may be a case that Mroonga doesn't work as expected. It may be a crash bug or SQL returns unexpected search results. To solve such a is- sues with you, We need a reproducible bug report. For a reproducible bug report, it is better to report with the follow- ing information as much as you can because it is useful to investigate the reason which causes the bug. +----------------------------+-----------------------------------+ | Name | Description | +----------------------------+-----------------------------------+ | Environment information | Installed packages infor- | | | mation to detect what ver- | | | sion is used. | | | | | | rpm: | | | | | | rpm -qa > INSTALLED.txt | | | | | | deb: | | | | | | dpkg -l > INSTALLED.txt | | | | | | On Windows, show us | | | what zip package is | | | used. | +----------------------------+-----------------------------------+ | Executing query which | It shows what you did. | | triggers the issue | | +----------------------------+-----------------------------------+ | Sample schema and data | It is recommended to provide a | | | minimum schema and sample data to | | | reproduce the issue. It helps to | | | solve the issue sooner. | +----------------------------+-----------------------------------+ | MySQL error log | It may include backtrace informa- | | | tion why crashed unexpectedly. | | | It may exists on | | | /var/log/mysqld.log for example. | | | (It depends on log-error configu- | | | ration) | +----------------------------+-----------------------------------+ | groonga.log | It may include low level error | | | information. It may exists on | | | /var/lib/mysql/groonga.log for | | | example. (It depends on | | | mroonga_log_file configuration) | +----------------------------+-----------------------------------+ There is a case that it is not enough to reproduce a bug with above in- formation. For such a case, additional information is required. +----------------------------+----------------------------+ | Name | Description | +----------------------------+----------------------------+ | MySQL dump | It may be required to con- | | | struct same database from | | | scratch. It may be gener- | | | ated by mysqldump -u USER | | | -pPASSWORD --opt | | | --flush-logs --sin- | | | gle-transaction --mas- | | | ter-data=2 --default-char- | | | acter-set=utf8 --hex-blob | | | --databases DATABASE > | | | dump.sql and so on. | +----------------------------+----------------------------+ | MySQL binlog | If dump data is enough to | | | reproduce the issue, it is | | | not required. It may ex- | | | ists on | | | /var/lib/mysql/mysql-bin.* | | | for example. (It depends | | | on log-bin configuration) | +----------------------------+----------------------------+ | MySQL data directory (Of- | If dump data and binlog | | fline backup) | pair is lost, It may be | | | required. It may exists | | | on /var/lib/mysql/ for ex- | | | ample. (It depends on | | | datadir configuration) | +----------------------------+----------------------------+ How to contribute in documentation topics We use Sphinx for documentation tool. pull request doc/source.rst GitHub • The things you must do at first • The things you need to do every tasks • The things you need to do every files The things you must do at first Git configuration gitgit $ git config --global user.name "Your Name" $ git config --global user.email "Email address" Fork on GitHub First, create GitHub account. If your GitHub account is ready, login to GitHub and access following URL. • Fork the Mroonga repository Fork fork Initial configuration for working repository Clone Mroonga repository to working directory. Don't forget to do "Git configuration". $ git clone git@github.com:(YOUR_GITHUB_ACCOUNT)/mroonga.git $ cd mroonga $ git remote add upstream git@github.com:mroonga/mroonga Prerequisites Before generating the Mroonga documentation, we need to build Mroonga. Please refer to Others for the instructions on how to build Mroonga. Required software Generating Mroonga documentation needs the followings. • Sphinx • gettext gem We use Sphinx for documentation tool and use gettext gem for localiza- tion. You can install both tools using the following commands. $ pip install -r doc/requirements.txt $ (cd doc && bundle install) Initial configuration for building documentation Execute following commands to prepare for generating Mroonga documenta- tion. $ cmake \ -S . \ -B ../mroonga.doc \ --preset=doc \ -DMYSQL_SOURCE_DIR=(MySQL_SOURCE_DIRECTORY) \ -DMYSQL_BUILD_DIR=(MySQL_BUILD_DIRECTORY) \ -DMYSQL_CONFIG=(MySQL_CONFIG) Next step is "The things you need to do every tasks". The things you need to do every tasks Follow the upstream Mroonga $ git fetch --all $ git checkout main $ git rebase upstream/main The things you need to do every files The characteristics of Mroonga doc/source/- .rstdoc/source/characteristic.rst Create working branch Create a working branch. Use meaningful branch name. $ git checkout -b use-capitalized-notation-characteristic Editing text Fix typos, styles or write a new document for Mroonga. Confirm generated document Execute following command to generate HTML files that reflect your changes. $ cmake --build ../mroonga.doc Open the generated file in your Web browser to preview your changes are reflected. $ open ../mroonga.doc/doc/en/html/characteristic.html Commit HTML $ cd ${clone} $ git add doc/source/characteristic.rst $ git commit doc: use "Mroonga" notation Push and pull request Publish your changes to your own GitHub repository. $ git push -u origin use-capitalized-notation-characteristic Note that use-capitalized-notation-characteristic is already created branch in advance. https://github.com/(GitHub)/mroonga @use-capitalized-notation-charac- teristic@ pull requestUIpull request pull requestOK DEVELOPER'S GUIDE How to debug Building for debugging When you build software for debugging, you can get more information like symbol resolutions in GDB. So we build both MySQL and Mroonga for debugging in development. NOTE: If you build one of them for debugging, the size of structures etc. might be different, and you might not be able to load Mroonga, or assertions don't work in running. How to build MySQL for debugging As you can see in MySQL :: 2.9.4 Installing MySQL Using a Standard Source Distribution, you can build MySQL for debugging by passing -DWITH_DEBUG=ON option in CMake options. The procedure from download to build is the following: $ mkdir -p ~/work/ $ cd ~/work/ $ wget https://cdn.mysql.com/Downloads/MySQL-8.4/mysql-8.4.2.tar.gz $ tar xf mysql-8.4.2.tar.gz $ cmake \ -Smysql-8.4.2 \ -Bmysql-8.4.2.build \ -GNinja \ -DCMAKE_BUILD_TYPE=Debug \ -DCMAKE_INSTALL_PREFIX=/tmp/local \ -DWITH_DEBUG=ON $ cmake --build mysql-8.4.2.build How to build MariaDB for debugging If you want to use MariaDB instead of MySQL, you can use -DWITH_DE- BUG=ON too. Note that you need to remove storage/mroonga/ (Mroonga bundled in Mari- aDB) before you build MariaDB. The procedure from download to build is the following: $ mkdir -p ~/work/ $ cd ~/work/ $ wget https://downloads.mariadb.org/rest-api/mariadb/11.4.3/mariadb-11.4.3.tar.gz $ tar xf mariadb-11.4.3.tar.gz $ rm -rf mariadb-11.4.3/storage/mroonga $ cmake \ -Smariadb-11.4.3 \ -Bmariadb-11.4.3.build \ -GNinja \ -DCMAKE_BUILD_TYPE=Debug \ -DCMAKE_INSTALL_PREFIX=/tmp/local \ -DWITH_DEBUG=ON $ cmake --build mariadb-11.4.3.build How tom build Mroonga for debugging You can build Mroonga for debugging by passing --preset=debug in CMake options. NOTE: In order to build Mroonga, you need to install required tools and libraries beforehand. See Others for the details of dependencies. The procedure from cloning repository to build is the following: $ cd ~/work/ $ git clone git@github.com:mroonga/mroonga.git $ cmake \ -Smroonga \ -Bmroonga.mysql-8.4 \ --preset=debug \ -DCMAKE_INSTALL_PREFIX=/tmp/local \ -DMYSQL_BUILD_DIR=$HOME/work/mysql-8.4.2.build \ -DMYSQL_CONFIG=$HOME/work/mysql-8.4.2.build/scripts/mysql_config \ -DMYSQL_SOURCE_DIR=$HOME/work/mysql-8.4.2 $ cmake --build mroonga.mysql-8.4 When you successfully build both, please invoke tests like the follow- ing. If you get [pass] for all tests, you succeeded to build for debug- ging: $ cd mroonga.mysql-8.4 $ ../mroonga/test/run-sql-test.sh More about run-sql-test.sh run-sql-test.sh is our friend for debugging. Here we show some examples of its usage. Run the specified test only When you invoke run-sql-test.sh without any option, all tests under mysql-test/mroonga will be invoked. So if you want to run certain tests only, you can specify the test: $ ../mroonga/test/run-sql-test.sh ../mroonga/mysql-test/mroonga/storage/t/truncate.test See the trace When you run tests by adding --debug option like the following, func- tion calls information is recorded: $ ../mroonga/test/run-sql-test.sh --debug ../mroonga/mysql-test/mroonga/storage/t/truncate.test This information is stored in ${MySQL's build direc- tory}/mysql-test/var/log/mysqld.1.trace. When you add a new method, it would be a good idea to put MRN_DBUG_EN- TER_METHOD()/DBUG_RETURN() in the beginning/ending of method and record its calls. Invoking GDB By adding --manual-gdb option, you can debug with GDB when you run tests: $ ../mroonga/test/run-sql-test.sh --manual-gdb ../mroonga/mysql-test/mroonga/storage/t/truncate.test You need to run gdb in other terminal for this. The command line to run gdb will be showed by run-sql-test.sh. Release procedure Requirements Here is the requirements about release procedure. • Debian GNU/Linux (sid) • Use zsh as user shell Use the following working directories. • MROONGA_SOURCE_DIR=$HOME/work/mroonga • MROONGA_BUILD_DIR=$HOME/work/build-dir/mroonga • MROONGA_ORG_DIR=$HOME/work/mroonga.org • MARIADB_SOURCE_DIR=$HOME/work/mariadb • MARIADB_BUILD_DIR=$HOME/work/build-dir/mariadb • GROONGA_SOURCE_DIR=$HOME/work/groonga • GROONGA_BUILD_DIR=$HOME/work/build-dir/groonga • INSTALL_DIR=/tmp/local Setup build environment Install the following packages: % sudo apt-get install -V ruby mecab libmecab-dev gnupg2 dh-autoreconf python-sphinx bison % pip3 install -U sphinx myst-parser linkify-it-py Describe the changes Summarize recent changes since the previous release into doc/source/news/xx.md. Should be included • The changes which affect to users • The changes which broke compatibility Shoud not be included • The changes which doesn't affect to users (Internal source code changes or refactoring) Execute the following command to create HTML for news: % cmake -S ${MARIADB_SOURCE_DIR} -B ${MARIADB_BUILD_DIR} -GNinja -DCMAKE_INSTALL_PREFIX=${INSTALL_DIR} -DPLUGIN_CASSANDRA=NO % cmake --build ${MARIADB_BUILD_DIR} % cmake --install ${MARIADB_BUILD_DIR} % cmake -S ${GROONGA_SOURCE_DIRECTORY} -B ${GROONGA_BUILD_DIR} --preset=release-default --fresh -DCMAKE_INSTALL_PREFIX="${INSTALL_DIR}" % cmake --build ${GROONGA_BUILD_DIR} % cmake --install ${GROONGA_BUILD_DIR} % PKG_CONFIG_PATH=${INSTALL_DIR}/lib/pkgconfig cmake -S ${MROONGA_SOURCE_DIRECTORY} -B ${MROONGA_BUILD_DIR} --fresh --preset=doc -DCMAKE_INSTALL_PREFIX="${INSTALL_DIR}" -DMYSQL_SOURCE_DIR=~${MARIADB_SOURCE_DIR} -DMYSQL_BUILD_DIR=${MARIADB_BUILD_DIR} -DMYSQL_CONFIG=${INSTALL_DIR}/bin/mariadb_config % cmake --build ${MROONGA_BUILD_DIR} Check whether you can upload packages Check whether you can login to packages.groonga.org as packages user. You can check with the following command whether you can login: % ssh packages@packages.groonga.org If you can't login to packages.groonga.org, you must be registered ssh public key. Execute make update-latest-release Execute rake release:version:update command with OLD_RELEASE_DATE, NEW_RELEASE_DATE. When 14.07 release, we executed the following command: % rake release:version:update OLD_RELEASE=14.04 OLD_RELEASE_DATE=2024-06-12 NEW_RELEASE_DATE=2024-09-06 This command updates the version of spec file or debian/changelog en- try. Confirm the results of each test We confirm the results of all the below tests and build before setting the tag to Mroonga. Because if we will find problems in Mroonga after setting the tag to it, we must release it again. • GitHub Actions • Launchpad How to build packages for Ubuntu on Nightly: Download source archive from GitHub actions. % mv mroonga-14.07.tar.gz mroonga/ % cd mroonga/packages % rake ubuntu DPUT_CONFIGURATION_NAME=groonga-ppa-nightly DPUT_INCOMING="~groonga/ubuntu/nightly" LAUNCHPAD_UPLOADER_PGP_KEY=xxxxxxx Tagging for release Execute the following command for tagging: % rake release:tag Upload archive files Execute the following command for uploading source archive: % cd packages/source % rake source As a result, tar.gz archive file is available from - https://packages.groonga.org/source/mroonga/. Create packages for the release Create Linux and Windows packages. Debian Change working directory to packages: % cd packages Execute the following command: % rake apt Now we finish build and upload packages to - https://packages.groonga.org/. However, these packages are unsigned. We sign packages by executing the below commands: % cd $PACKAGES_GROONGA_ORG_REPOSITORY % rake apt Debian derivatives(Ubuntu) For Ubuntu, packages are provided by PPA on launchpad.net. Change working directory to packages and execute rake ubuntu:upload command: % cd packages % rake ubuntu When upload packages was succeeded, package build process is executed on launchpad.net. Then build result is notified via E-mail. You can install packages via Groonga PPA on launchpad.net: https://launchpad.net/~groonga/+archive/ubuntu/ppa Red Hat derivatives Change working directory to packages % cd packages Execute the following command: % rake yum Now we finish build and upload packages to - https://packages.groonga.org/. However, these packages are unsigned. We sign packages by executing the below commands: % cd $PACKAGES_GROONGA_ORG_REPOSITORY % rake yum Windows For windows packages, we use artifacts of GitHub release page . Update Docker images TODO Upload documents Execute the following command: % rake release:document:update BUILD_DIR=${MROONGA_BUILD_DIR} MROONGA_ORG_DIR=${MROONGA_ORG_DIR} Commit changes in mroonga.org repository && push them. Update blog(Mroonga blog) We update the below files. • $MROONGA_GITHUB_COM_PATH/ja/_posts/(the date of re- lease)-mroonga-(version).md • $MROONGA_GITHUB_COM_PATH/en/_posts/(the date of re- lease)-mroonga-(version).md We can confirm contents of blog on Web browser by using Jekyll.: % jekyll serve We access http://localhost:4000 on our web browser for confirming con- tents. NOTE: If we want private to blog contents, we set false on published: in .md file.: --- layout: post.en title: Mroonga 10.01 has been released! description: Mroonga 10.01 has been released! published: false --- Announce release for mailing list Send release announce for each mailing list • ml@mysql.gr.jp for Japanese • groonga-dev@lists.osdn.me for Japanese • groonga-talk@lists.sourceforge.net for English Announce release for twitter Click Tweet link in Mrooga blog entry. You can share tweet about latest release. If you use tweet link, title of release announce and URL is embedded into your tweet. Execute sharing tweet in Japanese and English version of blog entry. Note that this tweet should be done when logged in by groonga account. Announce release for Facebook We announce release from Mroonga group in Facebook. https://www.facebook.com/mroonga/ Bump version Bump version to the latest release: % rake dev:version:bump NEW_VERSION=xx.xx TODO: Translate. 1Mroonga • • TODO: gdb C++C : ha_mroonga.cpp C: mrn_sys.c "_" : ha_mroonga.cpp : HA_MROONGA.cpp "_": hamroonga.cpp "-": ha-mroonga.cpp .cpp : ha_mroonga.cpp .cc : ha_mroonga.cc .hpp : ha_mroonga.hpp .h : ha_mroonga.h using namespace using namespace std : ha_mroonga.cpp: using namespace std; using namespace : ha_mroonga.hpp: using namespace std; std using namespace : ha_mroonga.cpp: using namespace zmq; include 2 include HEADER_FILE_NAME_HPP_ : mrn_db_path.hpp: #ifndef MRN_DB_PATH_HPP_ #define MRN_DB_PATH_HPP_ ... #endif // MRN_DB_PATH_HPP_ : mrn_db_path.hpp: #ifndef mrn_db_path_hpp_ #define mrn_db_path_hpp_ ... #endif // mrn_db_path_hpp_ : mrn_db_path.hpp: #ifndef _MRN_DB_PATH_HPP #define _MRN_DB_PATH_HPP ... #endif // _MRN_DB_PATH_HPP = 1 : int i = 0; : for (i=0; i<10; ++i) {...} 2: int i = 0; \0 : char *raw_data = "table_name column_name column_value" char *column_name; size_t column_name_size; column_name = raw_data + strlen("table_name "); column_name_size = strlen("column_name"); \0 : char *raw_data = "table_name column_name column_value" char *column_name; column_name = strndup(raw_data + strlen("table_name "), strlen("column_name")); \0 \0 : char *database_path = "db/test.mrn"; \0 : char *database_path = "db/test.mrn"; size_t database_path_size = strlen("db/test.mrn"); std::string : char database_path[MAX_PATH]; std::string : std::string database_path; TODO: GRN_BULK UpperCamelCase : class MyClass { } snail_case : class my_class { } ha_mroonga MySQL : class ha_mroonga: public handler { } UpperCamelCase : class HaMroonga: public handler { } snail_case _ : class MyClass { char *my_name_; } UpperCamelCase : class MyClass { char *MyName_; } : class MyClass { char *my_name; } : class MyClass { char *my_name_; const char *my_name() {return my_name_;}; } : class MyClass { char *my_name_; const char *my_name_() {return my_name_;}; } get_ : class MyClass { char *my_name_; const char *_my_name() {return my_name_;}; } set_ : class MyClass { unsigned int age_; void set_age(unsigned int age) { age_ = age; }; } : class MyClass { unsigned int age_; void set_age_(unsigned int age) { age_ = age; }; } set_ update_ : class MyClass { unsigned int age_; void update_age(unsigned int age) { age_ = age; }; } : class MyClass { private: MyClass(const MyClass &); } : class MyClass { } : class MyClass { unsigned int age_; MyClass(const MyClass &object) { age_ = object.age_; } } : class MyClass { private: MyClass &operator=(const MyClass &); } : class MyClass { } : class MyClass { unsigned int age_; MyClass &operator=(const MyClass &object) { age_ = object.age_; return *this; } } void void : class MyClass { unsigned int age_; unsigned int age() { return age_; }; } void : class MyClass { unsigned int age_; unsigned int age(void) { return age_; }; } const const : class Table { void insert(unsigned int id, const char *column_name, const char *value) { Record *record = records[i]; Column *column = columns[column_name]; column.set_value(value); } } const : class Table { void insert(unsigned int id, char *column_name, char *value) { Record *record = records[i]; Column *column = columns[column_name]; column.set_value(value); } } const : const char *MRN_LOG_FILE_PATH = "groonga.log"; #define : #define MRN_LOG_FILE_PATH "groonga.log" bool bool : bool is_searching; int : int is_searching; true false true false : bool is_searching = true; 0 : bool is_searching = 1; boolean_value == true boolean_value : boolean is_searching = true; if (!is_searching) { ... } : boolean is_searching = true; if (is_searching == false) { ... } NULL NULL value == NULL !value NULL Lisp nil NULL : char *name = NULL; if (!name) { ... } NULL : char *name = NULL; if (name == NULL) { ... } CC++ 0 0 strcmp(...) == 0 C++ bool CAPI int int int : if (memcmp(value1, value2, value_size) == 0) { printf("same value!\n"); } 0 : if (!memcmp(value1, value2, value_size)) { printf("same value!\n"); } : class Table { Table(const char *name); std::string name_; } Table::Table(const char *name) : name_(name) { } std::string(name) name_ = ... 2: class Table { Table(const char *name); std::string name_; } Table::Table(const char *name) { name_ = std::string(name); } : std::string name("users"); std::string() name = ... 2: std::string name; name = std::string("users"); int : for (int i = 0; i < 10; ++i) { } : for (int i = 0; i < 10; ++i) { } C++ C const C++ const const_cast static_cast static_cast const rein- terpret_cast const_cast : uchar *to_key; const ucahr *from_key; KEY *key_info; uint key_length; key_copy(to_key, const_cast<uchar *>from_key, key_info, key_length); static_cast : int n_hits = 1; int n_documents = 10; float hit_ratio = (float)(n_hits) / n_documents; static_cast reinterpret_cast : THD *thread = current_thd; my_hash_delete(&mrn_allocated_thds, reinterpret_cast<uchar *>(thread)); C: int n_hits = 1; int n_documents = 10; float hit_ratio = (float)(n_hits) / n_documents; static_cast reinterpret_cast : void *value = get_value(key); char *name; name = reinterpret_cast<char *>(value); * & C * 2 * : char* key, *value; & * & : char *key; : bool is_exist(const std::string &file_name); : char* key; • 1 • () • • • 2 1003 • • • assert • implicit template instantiation • typeid • MySQLOKMroonga MySQL How we are developing The Mroonga project uses Redmine for Ticket Driven Development. The location of our Redmine site is the following. http://redmine.groonga.org/projects/mroonga Developments are done per ticket. We would like to ask developers to register on the site above. Roadmap We develop based on the roadmap, and you can see our roadmap in the following page. http://redmine.groonga.org/projects/mroonga/roadmap We draw up our roadmap by discussing in off-line or on-line meetings. The roadmap specifies the list of adding features in each version. Source code management We manage the source code using Git on Github. http://github.com/mroonga/mroonga If you want to contribute, please make clone in read/write mode, and then commit and push. If you still do not have your account on github.com, please register. Development process We develop Mroonga using the following procedures. 1. Create a new ticket and add a description 2. Discuss in the development meeting and update roadmap 3. Assign the ticket, and the assignee starts development 4. Implement features or fix bugs, then push to the repository if needed 5. Add and run test codes, then push to the repository 6. Review the implementation and tests, and go back to 4. if needed 7. Add or update documents, then push to the repository 8. Close the ticket We welcome your ideas about new features or changes of specifications. Please create a ticket first and describe your idea there. For adding and running test codes, please refer the description below. We want to ask the assignee to handle whole the process, like design, implementation, test and documentations. Then you ask other developers to review (especially in case of adding new features), and when it passes you add or update documents and the ticket can be closed. For adding and updating documents, please refer the description below. Development environment Currently we are assuming the following development environments. • Linux x86_64 • glibc • MySQL • Groonga • Cutter (for C/C++ unit tests) • Sphinx (for documents) Contents of the source tree There are just a few source files for now, and we would like to keep it simple as possible. ha_mroonga.hpp The header file of Mroonga. ha_mroonga.cpp The implementation of Mroonga. mrnsys.hpp The header file of utility functions mrnsys.cpp The implementation of utility functions mysql-test/ The directory for tests by SQL SQL test scripts are included in each test case 't/' directory, that are also the definition of currently available SQL state- ments. The expected results of SQL tests are included in each test case 'r/' directory , that are also the definition of the current specification of supported SQL. test/unit/ The directory for per-function unit tests by C/C++ doc/source/ English documents in Sphinx format Since we are still in the early stage of the development of Mroonga, we will not make the documents of the specification of SQL queries for now. Alternatively we consider our SQL tests and its expected results as the list of features and the definition of their specifications. Adding and running tests We use two kinds of regression tests to manage the quality of Mroonga. SQL tests When you add features or fix bugs that can be confirmed by SQL queries, please always add SQL tests. You might think that per- formance improvements cannot be confirmed with SQL, but some can be still well tested by using status variables or informa- tion_schema plugin etc. C/C++ unit tests They are function level regression tests using Cutter. When you add features that cannot differ SQL queries' results, like util- ity functions, please add tests here. Before pushing to the repository, please always run the regression tests and confirm that you don't introduce any degradation. You can invoke these two kinds of tests by "make check". SQL tests are implemented as "sub test suite" for "mysql-test" in MySQL's regression tests. For the detail about how to add test cases or how to modify expected result files, please refer the following MySQL document. http://dev.mysql.com/doc/mysqltest/2.0/en/index.html For the detail about C/C++ unit tests, please refer the following Cut- ter document. http://cutter.sourceforge.net/ Adding and updating documents See How to contribute in documentation topics about adding and updating documents. • Index • Module Index • Search Page AUTHOR Mroonga Project COPYRIGHT 2009-2025, Mroonga Project mrn_version Jan 29, 2025 MROONGA(1)
NAME | THE CHARACTERISTICS OF MROONGA | INSTALL | UPGRADE | TUTORIAL | FOOTNOTES | FOOTNOTES | REFERENCE | FAQ | COMMUNITY | HOW TO CONTRIBUTE TO MROONGA | DEVELOPER'S GUIDE | AUTHOR | COPYRIGHT
Want to link to this manual page? Use this URL:
<https://man.freebsd.org/cgi/man.cgi?query=mroonga&sektion=1&manpath=FreeBSD+Ports+15.0>
