RESTBase/service migration
To eventually remove the RESTBase service, all services that rely on it need to be migrated so they no longer require RESTBase. This means that all functionality that the services rely on RESTBase for has to be covered elsewhere. Generally, "elsewhere" means in the service itself (possibly using a shared library), or in a gateway or proxy that sits in front of the service (compare wikitech:REST Gateway).
Migration Quick-Guide
- rate limiting: RESTBase provides basic per-IP, per-instance rate limiting. Limits can be configured for each endpoint. If needed, functionality can be covered either in the gateway, or in service-runner (if the service is implemented using service-runner).
- security headers: RESTBase applies generic security headers (CSP and such) based on the response's content type. If needed, functionality can be covered either in the gateway, or directly in the service itself.
- enrichment/hydration: RESTBase has the ability to enrich the response body with information retrieved from other services. If this is needed, it has to be re-implemented in the service itself (possibly using the middleware pattern). The service proxy should be used when communicating with other services.
- title normalization: RESTBase can check and normalize page titles by making a query to the MediaWiki action API, before forwarding a request to a service. If this is needed, it has to be re-implemented in the service itself (typically using the library for accessing the MediaWiki API provided by the service template). However, if the service will be using the title to communicate with a MediaWiki API, this may not be necessary: the target API may issue normalization redirects (from REST endpoints) or indicate title normalization in the response body (from action API modules). The service can rely on the information from these responses to perform title normalization.
- cache control: RESTBase can be used to add cache control headers, most importantly the s-maxage parameter to control the edge caches (Varnish) and max-age to control client-side caching. This will have to be re-implemented in the service itself.
- edge cache purging: Services that make use of the edge caches with a large s-maxage value may be relying on RESTBase to actively purge cached data from the caches. This will have to be re-implemented in the service by emitting a resource-purge event through EventGate.
- persistence: If the service has been relying on RESTBase for persistence, this needs to be re-implemented inside the service. However, before doing so, the need for persistent storage should be re-assessed. If the storage is used as a transient cache, perhaps relying on the edge caches would be sufficient. If this is not the case, the following constraints should inform the choice and design of the storage system: atomicity, consistency, latency, total volume, read and write rates.
- pre-generation: If the service needs to pre-generate and store data that would be too slow to generate on the fly, and has been relying on RESTBase for this functionality, then this has to be re-implemented in the service. This can be done by either listening to Kafka topics directly (typically, the resource-change topic), or by configuring changeprop to call the service in a way that will trigger re-generation.
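As an illustration of the cache control item in the list above, the following is a minimal sketch of how a service might build the header itself once RESTBase no longer adds it. The function name cache_control_headers and the example values are hypothetical and not part of any existing service or library; only the s-maxage and max-age directives come from the text above.

```python
def cache_control_headers(s_maxage: int, max_age: int) -> dict:
    """Build a Cache-Control header for a service response.

    s_maxage: seconds the shared edge caches may keep the response.
    max_age:  seconds a client (browser) may keep the response.
    Hypothetical helper; the values to use depend on the service.
    """
    if s_maxage < 0 or max_age < 0:
        raise ValueError("cache lifetimes must be non-negative")
    return {"Cache-Control": f"s-maxage={s_maxage}, max-age={max_age}"}


# Example: let the edge caches keep the response for a day,
# and clients for five minutes.
headers = cache_control_headers(s_maxage=86400, max_age=300)
print(headers["Cache-Control"])
```

A service that sets a large s-maxage this way would typically also need the purging mechanism described in the list, since the edge caches will otherwise keep serving stale data until the lifetime expires.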
Parsoid endpoints
Moving to the new Parsoid endpoint in MediaWiki requires changing the endpoint URL structure. The Parsoid endpoint is now served by the /page endpoint in MediaWiki core. The old endpoint follows the pattern https://en.wikipedia.org/api/rest_v1/page/html/{title}, whereas the new endpoint follows the pattern https://en.wikipedia.org/w/rest.php/v1/page/{title}/html. In the new structure, the api/rest_v1 segment is replaced by w/rest.php/v1, and the /html segment now follows the {title} segment. This change is part of a broader effort to streamline the API endpoint structure within MediaWiki.
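The path change described above can be sketched as a small URL rewrite. This is an illustrative helper only: the function name migrate_html_url is hypothetical, it handles just the title-only form of the old URL, and carrying the query string over unchanged is an assumption that may not hold for every parameter.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_PREFIX = "/api/rest_v1/page/html/"
NEW_TEMPLATE = "/w/rest.php/v1/page/{title}/html"


def migrate_html_url(old_url: str) -> str:
    """Rewrite an old page/html URL to the new /page/{title}/html form.

    Hypothetical helper. The title is kept exactly as it appears in the
    old URL (still percent-encoded); it is not decoded or normalized.
    """
    parts = urlsplit(old_url)
    if not parts.path.startswith(OLD_PREFIX):
        raise ValueError("not an api/rest_v1 page/html URL")
    title = parts.path[len(OLD_PREFIX):]
    # Only the title-only pattern is covered; extra path segments are rejected.
    if not title or "/" in title:
        raise ValueError("expected exactly one title segment")
    new_path = NEW_TEMPLATE.format(title=title)
    # Assumption: scheme, host and query string are carried over unchanged.
    return urlunsplit((parts.scheme, parts.netloc, new_path, parts.query, parts.fragment))


print(migrate_html_url("https://en.wikipedia.org/api/rest_v1/page/html/Earth"))
```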
The new /page endpoint also provides output alternatives to HTML. The following endpoints can be accessed:
- https://en.wikipedia.org/w/rest.php/v1/page/{title} - JSON output containing wikitext content
- https://en.wikipedia.org/w/rest.php/v1/page/{title}/with_html - JSON output containing html content
- https://en.wikipedia.org/w/rest.php/v1/page/{title}/bare - JSON output with bare information and no content
- https://en.wikipedia.org/w/rest.php/v1/page/{title}/html - raw html (not within JSON)
Additionally, there is a revision-specific endpoint that offers the same four possibilities as above but takes a revision ID instead of a title, so it can be used to query old revisions. For example:
- https://en.wikipedia.org/w/rest.php/v1/revision/{revision-id}/with_html - JSON output containing html content
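The page and revision endpoints listed above share the same set of output variants, which a client could capture in a small URL builder. The sketch below is a hedged illustration: the names page_url, revision_url and VARIANTS, and the label "source" for the default JSON-with-wikitext output, are hypothetical. Percent-encoding the whole title, including any slashes, is an assumption about how titles should be placed in the path.

```python
from urllib.parse import quote

BASE = "https://en.wikipedia.org/w/rest.php/v1"

# Path suffix for each output variant listed above.
# "source" is a hypothetical label for the default JSON output with wikitext.
VARIANTS = {
    "source": "",
    "with_html": "/with_html",
    "bare": "/bare",
    "html": "/html",
}


def page_url(title: str, variant: str = "source") -> str:
    """URL for the latest revision of a page, in the given variant."""
    # Assumption: the title is fully percent-encoded, slashes included.
    return f"{BASE}/page/{quote(title, safe='')}{VARIANTS[variant]}"


def revision_url(revision_id: int, variant: str = "source") -> str:
    """URL for a specific revision, in the given variant."""
    return f"{BASE}/revision/{int(revision_id)}{VARIANTS[variant]}"


print(page_url("Earth", "html"))
print(revision_url(12345, "with_html"))
```

An unknown variant name raises a KeyError, which keeps typos from silently producing a URL for the default output.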
See also
A migration plan for each service has been laid out in the Services Architecture Recommendations (2019).
An overview can be found in the Services Re-architecture document.