Jump to content

Extension:External Data/データベース

From mediawiki.org
This page is a translated version of the page Extension:External Data/Databases and the translation is 14% complete.

You can use External Data to get data from outside databases; most relational DB types are supported, as well as MongoDB. The retrieval is done with either a set of parameters that together represent a SELECT statement, or the name of a prepared statement.

As of version 3.2, the recommended way to retrieve database data is to use one of the display functions (#external_value, #for_external_table, etc.), passing in the necessary parameters for the data retrieval, most notably "source=". You can also retrieve database data by calling the #get_db_data function, or (for version 3.0 and higher) #get_external_data. In all of these cases, you must specify the information, including login information, for the database in the variable $wgExternalDataSources in LocalSettings.php.

For any of these parser functions, you can also call its corresponding Lua function.

A note about security: If you are going to retrieve database data, you should think about the security implications. Configuring a database in LocalSettings.php will allow anyone with edit access to your wiki to run arbitrary SQL statements against that database, unless prepared statements are configured for the database connection. You should use a database user that has the minimum permissions for what you are trying to achieve. It is possible that complex SQL constructions could be passed to this function to cause it to do things vastly different from what it was designed for.

設定

Each database being accessed needs to be configured separately in LocalSettings.php. For normal databases (i.e., everything except for SQLite), add the following stanza for each database:

$wgExternalDataSources['ID'] = [
    'server' => 'server URL',
    'type' => 'DB type',
    'name' => 'DB name',
    'user' => 'username',
    'password' => 'password'
];

Where:

  • ID - a label for this database which is used when calling #get_db_data
  • server URL - the hostname on which the database lives
  • DB type - the type of database, i.e. mysql, postgres, mssql, oracle, sqlite, db2 or mongodb
  • DB name, username, password - details for accessing the database.

An example of a set of values would be:

$wgExternalDataSources['ID'] = [
    'server' => '127.0.0.1',
    'type' => 'mysql',
    'name' => 'employeesDatabase',
    'user' => 'guest',
    'password' => 'p@ssw0rd'
];

The following optional settings can also be added:

$wgExternalDataSources['ID']['flags'] = 'MediaWiki DB flags';
$wgExternalDataSources['ID']['prefix'] = 'table prefix';

Example values for these variables are:

$wgExternalDataSources['employee-db']['flags'] = DBO_NOBUFFER & DBO_TRX;
$wgExternalDataSources['employee-db']['prefix'] = 'emp_';

Support for database systems

MySQL, Postgres (i.e. PostgreSQL), DB2 and MongoDB should work fully by default (though there are syntax limitations, and differences, for MongoDB - see below). For MS SQL/SQL Server, SQLite and Oracle, you may need to perform some special handling.

Postgres

If you cannot connect to a PostgreSQL database, it may be because your PHP installation is lacking the PostgreSQL database module, php-pgsql. On many Linux systems, you can install it by calling the following, then restarting the web server:

This installs a php 5.5 dependency. PHP 5.5 is old and should not be used anymore. Replace php55 with the php version you are running.
yum install php55-php-pgsql

Amend the above configuration in LocalSettings.php to change the server type to "postgres":

$wgExternalDataSources['ID']['type'] = 'postgres';

SQLite

To connect to SQLite, you need something like the following in LocalSettings.php:

$wgExternalDataSources['ID'] = [
    'type' => 'sqlite',
    'directory' => '/directory/to/DB/file',
    'name' => 'Name of file, without .sqlite'
];

Oracle

Connecting to Oracle may work by default. If it doesn't work, the following may help:

  • Make sure that the Oracle client, and the PHP version being used, are using the same architecture: they have to either both be 32-bit, or both be 64-bit.
  • Make sure that the value of $edgDBServer for the installation matches something in the corresponding Oracle client .ora files. The value may need to look like "serverName/dbName", as opposed to "serverName".
  • If none of the above are the issue, you could try using the OdbcDatabase extension, which should work as well.

MongoDB

For MongoDB, there are no special connection parameters, although the username and password may be optional. There are two optional query parameters: aggregate and find query. Under PHP 7.*, the extension mongodb and library mongodb (composer require mongodb/mongodb) is required. Unfortunately, due to the way that MediaWiki continuous integration is built, this library cannot be simply added to composer.json for this extension (see T259743).

MongoDB is a non-SQL (or "NoSQL", if you prefer) database system, with its own querying language. When accessing MongoDB, you can either pass in a standard MongoDB query, or use the standard SQL-like syntax of #get_db_data. To use standard MongoDB querying, pass the query to the parameter |find query= or |aggregate=.

You can also use the standard querying functionality. There are some restrictions and differences, however, for the "where" clause:

  • only "AND"s can be used, not "OR"s
  • for the "LIKE" comparison, no text should be placed around the comparator - it should look like "Username LIKE Jo", not "Username LIKE '%Jo%'".

Because MongoDB returns values in JSON that may be complex, and contain compound values, you can get data that is stored in such a way by separating field names with dots. For instance, if the return data contains a value for a field called "Measurements" that is an array, holding values for fields called "Height" and "Width", then the "data=" parameter to #get_web_data could have a value like "height=Measurements.Height,width=Measurements.Width".

You can do Memcached- or APCU-based caching of values retrieved from MongoDB; to do that, you need the following two lines in LocalSettings.php:

$wgMainCacheType = CACHE_MEMCACHED /* or CACHE_ACCEL */;
$wgDataSources['MongoDB ID']['cache seconds'] = ''number of seconds'';

To enable ModgoDB under PHP 7.4, mongodb extension should be enabled (sudo apt install php-mongodb && sudo phpenmod mongodb; and also mongodb library should be installed with Composer: composer require mongodb/mongodb "^1.6.0" (this will be necessary until bug T259743 is resolved).

Microsoft SQL Server

MediaWiki formerly natively supported MS SQL Server; this support went away in version 1.34. However, starting with version 3.1, External Data provides its own support for accessing MS SQL Server via ODBC.

The typical settings for accessing MS SQL Server will be:

$wgExternalDataSources['mssql'] = [
   'driver'    => 'ODBC Driver 17 for SQL Server',
   'server'    => 'localhost,5500',
   'type'      => 'odbc',
   'name'      => 'NorthWind',
   'user'      => 'Test',
   'password'  => 'Test0000'
];

Note the 'driver' setting and the comma between domain name and port in the 'server' parameter. The odbc PHP extension has to be installed, as well as the Microsoft ODBC driver for SQL Server. The file /opt/microsoft/msodbcsql17/etc/odbcinst.ini should contain

[ODBC Driver 17 for SQL Server]
Description=Microsoft ODBC Driver 17 for SQL Server
Driver=/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.5.so.2.1

使用法

The examples below all refer to #get_db_data. To retrieve DB data using any of the other parser (or Lua) functions, you can use all of the same parameters, but specify "source=", instead of "db=", for the ID stored in LocalSettings.php.

To get data from an external database, call the following:

{{#get_db_data:
 db=database ID. This parameter can be passed as 'source' or even anonymously, provided there are no equal signs in it
 |from=from clause
 |where=where clause
 |limit=limit
 |order by=order by clause
 |group by=group by clause
 |data=data mappings
 |suppress error
}}

An explanation of the fields:

Parameter name Description Additional notes
db= The identifying label configured in LocalSettings.php
from= An SQL "FROM" clause, i.e. one or more tables Can be as simple as tableName or as complex as tableName1 = alias1, tableName2 = alias2, etc.
join on= Corresponds to an SQL "JOIN ... ON" clause; used if there is more than one table being queried. An example value would be tableName1.ID = tableName2.id_field, etc.
where= An SQL "WHERE" clause (optional)
limit= An SQL "LIMIT" clause, i.e. a number, limiting the number of results returned (optional)
order by= An SQL "ORDER BY" clause (optional)
data= Mapping of database column names to local variables. The syntax is localVariable1=databaseColumn1, localVariable2=databaseColumn2, …. In the example below, "employeeName" is the name of the database column and "name" is the local variable.

Unless the database is MongoDB, data can be omitted, in which case, the select statement will begin with SELECT * and all columns will be mapped to internal variables, with names in lowercase.

suppress error Prevents any error message from getting displayed if there is a problem retrieving the data (optional)

An example call, using the "employee database" example from above:

{{#get_db_data:
 db=employee-db
 |from=tblPersonalInfo
 |where=employeeID='{{{id}}}'
 |limit=50
 |order by=employeeName ASC
 |data=dbemail=employeeEmail,name=employeeName
}}

Prepared statements

A safer approach is to define one or more prepared statements for the database connections defined in LocalSettings.php, in 'prepared' item of the data source configuration array, which can be a string, containing a SQL query with parameters, for the only statement, or an associative array ([ 'query id' => 'SQL' ]), for several. Items of this associative array can be either strings holding the prepared statement that take no parameters or only parameters of the default type s (strings), or arrays of the form [ 'query' => 'SELECT ...' /* the prepared statement */, types => 'si' /* parameter types, one character per each */ ]. See mysqli_stmt::bind_param for parameter types. Setting parameter types is only relevant for mySQL.

Parameters to the prepared statement are passed as a comma-separated list in parser function argument parameters. If there are several prepared statements defined for the same connection, the needed statement ID is passed as query parameter. If prepared statements are defined, arbitrary queries will not be created for the same connection.

Examples

MySQL: only one statement allowed for the connection

$wgExternalDataSources['rfam2']	= [
	'server'   => 'mysql-rfam-public.ebi.ac.uk:4497',
	'type'     => 'mysql',
	'name'     => 'Rfam',
	'user'     => 'rfamro',
	'password' => '',
	'prepared' => <<<'SQL'
SELECT fr.rfam_acc, fr.rfamseq_acc, fr.seq_start, fr.seq_end
FROM full_region fr, rfamseq rf, taxonomy tx
WHERE rf.ncbi_id = tx.ncbi_id
AND fr.rfamseq_acc = rf.rfamseq_acc
AND tx.ncbi_id = ? -- the only parameter to the query.
AND is_significant = 1 -- exclude low-scoring matches from the same clan
LIMIT 20;
SQL,
	'types' => 's'
];
{{#get_db_data:db = rfam2
 | parameters=10116 <!-- this parameter is used to substitute question marks in the prepared statement -->
 | data=account=rfam_acc,sec=rfamseq_acc,start=seq_start,end=seq_end
}}

MySQL: several statements per connection

$wgExternalDataSources['rfam3']	= [
	'server'   => 'mysql-rfam-public.ebi.ac.uk:4497',
	'type'     => 'mysql',
	'name'     => 'Rfam',
	'user'     => 'rfamro',
	'password' => '',
	'prepared' => [
		// Parameter types set explicitly.
		'sequences' => [
			'query' => <<<'SEQ'
SELECT fr.rfam_acc, fr.rfamseq_acc, fr.seq_start, fr.seq_end
FROM full_region fr, rfamseq rf, taxonomy tx
WHERE rf.ncbi_id = tx.ncbi_id
AND fr.rfamseq_acc = rf.rfamseq_acc
AND tx.ncbi_id = ? -- the only parameter to the query.
AND is_significant = 1 -- exclude low-scoring matches from the same clan
SEQ,
			'types' => 's'
		],
		// Default parameter types.
		'sno' => <<<'SNO'
SELECT fr.rfam_acc, fr.rfamseq_acc, fr.seq_start, fr.seq_end, f.type
FROM full_region fr, rfamseq rf, taxonomy tx, family f
WHERE
rf.ncbi_id = tx.ncbi_id
AND f.rfam_acc = fr.rfam_acc
AND fr.rfamseq_acc = rf.rfamseq_acc
AND tx.tax_string LIKE ? -- the only parameter to the query.
AND f.type LIKE '%snoRNA%'
AND is_significant = 1 -- exclude low-scoring matches from the same clan
SNO
	]
];
{{#get_db_data: db = rfam3
 | query=sequences <!-- this parameter is used to choose one of the prepared statements -->
 | parameters=10116 <!-- this parameter is used to substitute question marks in the prepared statement -->
 | data=account=rfam_acc,sec=rfamseq_acc,start=seq_start,end=seq_end
}}

PostgreSQL

Download and restore 'dvdrental' example database. Create user 'ED' with password 'ED' and grant to it read rights on 'public' scheme in 'dvdrental' database.

Create the function films_shorter_than():

create or replace function films_shorter_than (int) returns table (title varchar, description text, length smallint)
language plpgsql as $$
begin
	return query select film.title, film.description, film.length
	from film
	where film.length <= $1;
end;
$$;

Add 'postgresql' database connection to LocalSettings.php:

$wgExternalDataSources['postgresql'] = [
	'server'    => 'localhost',
	'type'      => 'postgres',
	'name'      => 'dvdrental',
	'user'      => 'ED',
	'password'  => 'ED',
	'prepared'  => <<<'POSTGRE'
SELECT title, description, length
FROM films_shorter_than($1)
ORDER BY length ASC
LIMIT 25;
POSTGRE
];

Now, you can obtain a list of films shorter than 55 minutes with:

{{#get_external_data:
    db         = postgresql
  | limit      = 25
  | data       = title=title,description=description,length=length
  | parameters = 55
 }}
 {| class="wikitable"
 ! Title !! Description !! Length {{#for_external_table:<nowiki/>
 {{!}}-
 {{!}} {{{title}}} {{!}}{{!}} {{{description}}} {{!}}{{!}} {{{length}}}
 }}
 |}

Microsoft SQL Server

It is assumed that Microsoft SQL server and the ODBC driver for it are installed, as well as the odbc extension for PHP; and the Northwind database is imported.

/opt/microsoft/msodbcsql17/etc/odbcinst.ini should have the entry:

[ODBC Driver 17 for SQL Server]
Description=Microsoft ODBC Driver 17 for SQL Server
Driver=/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.5.so.2.1

Then necessary configuration setting for External Data will be:

$wgExternalDataSources['Northwind'] = [
   'driver'    => 'ODBC Driver 17 for SQL Server',
   'server'    => 'localhost,5500',
   'type'      => 'odbc',
   'name'      => 'NorthWind',
   'user'      => 'Test',
   'password'  => 'Test0000',
   'prepared'  => [
		'employees' => <<<'ODBC'
SELECT TOP 10 TitleOfCourtesy, FirstName, LastName, Title, City, COUNT(OrderID) AS NoOrders
FROM Employees LEFT JOIN Orders ON Employees.EmployeeID = Orders.EmployeeID
WHERE Employees.Region = ?
GROUP BY TitleOfCourtesy, FirstName, LastName, Title, City
ORDER BY NoOrders DESC;
ODBC
	]
];

and the parser function call retrieving and displayng the Washington employees data will be:

{{#get_db_data:
     db = Northwind
   | query = employees
   | parameters = WA
   | data = title = TitleOfCourtesy, first = FirstName, last = LastName, position = Title, city = City, orders = COUNT(OrderID) AS NoOrders
  }}
  {| class="wikitable"
   ! Name !! Position !! Orders !! City{{#for_external_table:<nowiki/>
   {{!}}-
   {{!}} {{{title}}} {{{first}}} {{{last}}}
   {{!}} {{{position}}}
   {{!}} {{{orders}}}
   {{!}} {{{city}}}
  }}
  |}


  • "MediaWiki mySQL Demo" - YouTube video by Matthew Shell showing the use of External Data with prepared statements for MySQL, June 2024