Yann Neuhaus


Creating your private cloud using OpenStack – (4) – The Image and Placement services

Tue, 2025-01-21 08:14

By the end of the last post we finally got the first OpenStack service up and running: Keystone, the Identity Service. Going back to the list of services we need at a minimum, this still leaves us with some more to set up:

In this post we’ll set up the Image Service (Glance) and the Placement service. The Glance service is responsible for providing virtual machine images and enables users to discover, register, and retrieve them. Glance (as well as Keystone) exposes an API to query image metadata and to retrieve images.

There are several options for storing those virtual machine images (including block storage), but to keep this as simple as possible we’ll use a directory on the controller node.

Before we start, this is how our setup currently looks:

We’ve done almost all the work on the controller node, and Glance is no exception to that. As with Keystone, Glance needs to store some data in PostgreSQL (remember that you could also use MySQL or SQLite), in this case metadata about the images. Just as we did for Keystone, we’ll create a dedicated user and database for that:

[root@controller ~]$ su - postgres -c "psql -c \"create user glance with login password 'admin'\""
CREATE ROLE
[root@controller ~]$ su - postgres -c "psql -c 'create database glance with owner=glance'"
CREATE DATABASE
[root@controller ~]$ su - postgres -c "psql -l"
                                                       List of databases
   Name    |  Owner   | Encoding | Locale Provider |   Collate   |    Ctype    | ICU Locale | ICU Rules |   Access privileges   
-----------+----------+----------+-----------------+-------------+-------------+------------+-----------+-----------------------
 glance    | glance   | UTF8     | libc            | en_US.UTF-8 | en_US.UTF-8 |            |           | 
 keystone  | keystone | UTF8     | libc            | en_US.UTF-8 | en_US.UTF-8 |            |           | 
 postgres  | postgres | UTF8     | libc            | en_US.UTF-8 | en_US.UTF-8 |            |           | 
 template0 | postgres | UTF8     | libc            | en_US.UTF-8 | en_US.UTF-8 |            |           | =c/postgres          +
           |          |          |                 |             |             |            |           | postgres=CTc/postgres
 template1 | postgres | UTF8     | libc            | en_US.UTF-8 | en_US.UTF-8 |            |           | =c/postgres          +
           |          |          |                 |             |             |            |           | postgres=CTc/postgres
(5 rows)

The next step is creating the Glance service credentials:

[root@controller ~]$ . admin-openrc
[root@controller ~]$ openstack user create --domain default --password-prompt glance
User Password:
Repeat User Password:
No password was supplied, authentication will fail when a user does not have a password.
+---------------------+----------------------------------+
| Field               | Value                            |
+---------------------+----------------------------------+
| default_project_id  | None                             |
| domain_id           | default                          |
| email               | None                             |
| enabled             | True                             |
| id                  | eea5924c69c040428c5c4ef82f46c61b |
| name                | glance                           |
| description         | None                             |
| password_expires_at | None                             |
+---------------------+----------------------------------+
[root@controller ~]$  openstack service create --name glance --description "OpenStack Image" image
+-------------+----------------------------------+
| Field       | Value                            |
+-------------+----------------------------------+
| id          | db04f3f7c7014eb7883e074625d31391 |
| name        | glance                           |
| type        | image                            |
| enabled     | True                             |
| description | OpenStack Image                  |
+-------------+----------------------------------+

Create the endpoints:

[root@controller ~]$ openstack endpoint create --region RegionOne image public http://controller:9292
+--------------+----------------------------------+
| Field        | Value                            |
+--------------+----------------------------------+
| enabled      | True                             |
| id           | bde2499cd6c34c19aa221d64b9d0f2a0 |
| interface    | public                           |
| region       | RegionOne                        |
| region_id    | RegionOne                        |
| service_id   | db04f3f7c7014eb7883e074625d31391 |
| service_name | glance                           |
| service_type | image                            |
| url          | http://controller:9292           |
+--------------+----------------------------------+
[root@controller ~]$ openstack endpoint create --region RegionOne image internal http://controller:9292
+--------------+----------------------------------+
| Field        | Value                            |
+--------------+----------------------------------+
| enabled      | True                             |
| id           | 046e64889cda43c2bfc36471e40a7a2d |
| interface    | internal                         |
| region       | RegionOne                        |
| region_id    | RegionOne                        |
| service_id   | db04f3f7c7014eb7883e074625d31391 |
| service_name | glance                           |
| service_type | image                            |
| url          | http://controller:9292           |
+--------------+----------------------------------+
[root@controller ~]$ openstack endpoint create --region RegionOne image admin http://controller:9292
+--------------+----------------------------------+
| Field        | Value                            |
+--------------+----------------------------------+
| enabled      | True                             |
| id           | c8ddc792772048a18179877355fdbd3f |
| interface    | admin                            |
| region       | RegionOne                        |
| region_id    | RegionOne                        |
| service_id   | db04f3f7c7014eb7883e074625d31391 |
| service_name | glance                           |
| service_type | image                            |
| url          | http://controller:9292           |
+--------------+----------------------------------+

Install the operating system package:

[root@controller ~]$ dnf install openstack-glance -y

And finally the configuration of the Glance API service:

[root@ostack-controller ~]$ egrep -v "^#|^$" /etc/glance/glance-api.conf
[DEFAULT]
enabled_backends=fs:file
[barbican]
[barbican_service_user]
[cinder]
[cors]
[database]
connection = postgresql+psycopg2://glance:admin@localhost/glance
[file]
[glance.store.http.store]
[glance.store.rbd.store]
[glance.store.s3.store]
[glance.store.swift.store]
[glance.store.vmware_datastore.store]
[glance_store]
default_backend = fs
[healthcheck]
[image_format]
[key_manager]
[keystone_authtoken]
www_authenticate_uri  = http://controller:5000
auth_url = http://controller:5000
memcached_servers = controller:11211
auth_type = password
project_domain_name = Default
user_domain_name = Default
project_name = service
username = glance
password = admin
[os_brick]
[oslo_concurrency]
[oslo_limit]
auth_url = http://controller:5000
auth_type = password
user_domain_id = default
username = glance
system_scope = all
password = admin
endpoint_id = 1e7748b2d7d44fb6a0c17edb3c68c4de
region_name = RegionOne
[oslo_messaging_amqp]
[oslo_messaging_kafka]
[oslo_messaging_notifications]
[oslo_messaging_rabbit]
[oslo_middleware]
[oslo_policy]
[oslo_reports]
[paste_deploy]
flavor = keystone
[profiler]
[task]
[taskflow_executor]
[vault]
[wsgi]
[fs]
filesystem_store_datadir = /var/lib/glance/images/

The endpoint_id is the public image endpoint ID:

[root@controller ~]$ openstack endpoint list --service glance --region RegionOne
+----------------------------------+-----------+--------------+--------------+---------+-----------+------------------------+
| ID                               | Region    | Service Name | Service Type | Enabled | Interface | URL                    |
+----------------------------------+-----------+--------------+--------------+---------+-----------+------------------------+
| bde2499cd6c34c19aa221d64b9d0f2a0 | RegionOne | glance       | image        | True    | public    | http://controller:9292 |
| 046e64889cda43c2bfc36471e40a7a2d | RegionOne | glance       | image        | True    | internal  | http://controller:9292 |
| c8ddc792772048a18179877355fdbd3f | RegionOne | glance       | image        | True    | admin     | http://controller:9292 |
+----------------------------------+-----------+--------------+--------------+---------+-----------+------------------------+

Make sure that the glance account has reader access to system-scope resources (like limits):

[root@controller ~]$ openstack role add --user glance --user-domain Default --system all reader

Populate the Image service database:

[root@controller ~]$ /bin/sh -c "glance-manage db_sync" glance
2025-01-20 14:20:25.319 35181 INFO alembic.runtime.migration [-] Context impl PostgresqlImpl.
2025-01-20 14:20:25.319 35181 INFO alembic.runtime.migration [-] Will assume transactional DDL.
2025-01-20 14:20:25.327 35181 INFO alembic.runtime.migration [-] Context impl PostgresqlImpl.
2025-01-20 14:20:25.328 35181 INFO alembic.runtime.migration [-] Will assume transactional DDL.
2025-01-20 14:20:25.340 35181 INFO alembic.runtime.migration [-] Context impl PostgresqlImpl.
2025-01-20 14:20:25.340 35181 INFO alembic.runtime.migration [-] Will assume transactional DDL.
2025-01-20 14:20:25.350 35181 INFO alembic.runtime.migration [-] Running upgrade  -> liberty, liberty initial
2025-01-20 14:20:25.526 35181 INFO alembic.runtime.migration [-] Running upgrade liberty -> mitaka01, add index on created_at and updated_at columns of 'images' table
2025-01-20 14:20:25.534 35181 INFO alembic.runtime.migration [-] Running upgrade mitaka01 -> mitaka02, update metadef os_nova_server
2025-01-20 14:20:25.571 35181 INFO alembic.runtime.migration [-] Running upgrade mitaka02 -> ocata_expand01, add visibility to images
2025-01-20 14:20:25.575 35181 INFO alembic.runtime.migration [-] Running upgrade ocata_expand01 -> pike_expand01, empty expand for symmetry with pike_contract01
2025-01-20 14:20:25.576 35181 INFO alembic.runtime.migration [-] Running upgrade pike_expand01 -> queens_expand01
2025-01-20 14:20:25.576 35181 INFO alembic.runtime.migration [-] Running upgrade queens_expand01 -> rocky_expand01, add os_hidden column to images table
2025-01-20 14:20:25.580 35181 INFO alembic.runtime.migration [-] Running upgrade rocky_expand01 -> rocky_expand02, add os_hash_algo and os_hash_value columns to images table
2025-01-20 14:20:25.584 35181 INFO alembic.runtime.migration [-] Running upgrade rocky_expand02 -> train_expand01, empty expand for symmetry with train_contract01
2025-01-20 14:20:25.585 35181 INFO alembic.runtime.migration [-] Running upgrade train_expand01 -> ussuri_expand01, empty expand for symmetry with ussuri_expand01
2025-01-20 14:20:25.586 35181 INFO alembic.runtime.migration [-] Running upgrade ussuri_expand01 -> wallaby_expand01, add image_id, request_id, user columns to tasks table"
2025-01-20 14:20:25.598 35181 INFO alembic.runtime.migration [-] Running upgrade wallaby_expand01 -> xena_expand01, empty expand for symmetry with 2023_1_expand01
2025-01-20 14:20:25.600 35181 INFO alembic.runtime.migration [-] Running upgrade xena_expand01 -> yoga_expand01, empty expand for symmetry with 2023_1_expand01
2025-01-20 14:20:25.602 35181 INFO alembic.runtime.migration [-] Running upgrade yoga_expand01 -> zed_expand01, empty expand for symmetry with 2023_1_expand01
2025-01-20 14:20:25.603 35181 INFO alembic.runtime.migration [-] Running upgrade zed_expand01 -> 2023_1_expand01, empty expand for symmetry with 2023_1_expand01
2025-01-20 14:20:25.605 35181 INFO alembic.runtime.migration [-] Running upgrade 2023_1_expand01 -> 2024_1_expand01, adds cache_node_reference and cached_images table(s)
2025-01-20 14:20:25.677 35181 INFO alembic.runtime.migration [-] Context impl PostgresqlImpl.
2025-01-20 14:20:25.677 35181 INFO alembic.runtime.migration [-] Will assume transactional DDL.
Upgraded database to: 2024_1_expand01, current revision(s): 2024_1_expand01
2025-01-20 14:20:25.681 35181 INFO alembic.runtime.migration [-] Context impl PostgresqlImpl.
2025-01-20 14:20:25.681 35181 INFO alembic.runtime.migration [-] Will assume transactional DDL.
2025-01-20 14:20:25.684 35181 INFO alembic.runtime.migration [-] Context impl PostgresqlImpl.
2025-01-20 14:20:25.684 35181 INFO alembic.runtime.migration [-] Will assume transactional DDL.
Database migration is up to date. No migration needed.
2025-01-20 14:20:25.692 35181 INFO alembic.runtime.migration [-] Context impl PostgresqlImpl.
2025-01-20 14:20:25.692 35181 INFO alembic.runtime.migration [-] Will assume transactional DDL.
2025-01-20 14:20:25.699 35181 INFO alembic.runtime.migration [-] Context impl PostgresqlImpl.
2025-01-20 14:20:25.699 35181 INFO alembic.runtime.migration [-] Will assume transactional DDL.
2025-01-20 14:20:25.702 35181 INFO alembic.runtime.migration [-] Running upgrade mitaka02 -> ocata_contract01, remove is_public from images
2025-01-20 14:20:25.705 35181 INFO alembic.runtime.migration [-] Running upgrade ocata_contract01 -> pike_contract01, drop glare artifacts tables
2025-01-20 14:20:25.711 35181 INFO alembic.runtime.migration [-] Running upgrade pike_contract01 -> queens_contract01
2025-01-20 14:20:25.711 35181 INFO alembic.runtime.migration [-] Running upgrade queens_contract01 -> rocky_contract01
2025-01-20 14:20:25.712 35181 INFO alembic.runtime.migration [-] Running upgrade rocky_contract01 -> rocky_contract02
2025-01-20 14:20:25.712 35181 INFO alembic.runtime.migration [-] Running upgrade rocky_contract02 -> train_contract01
2025-01-20 14:20:25.712 35181 INFO alembic.runtime.migration [-] Running upgrade train_contract01 -> ussuri_contract01
2025-01-20 14:20:25.713 35181 INFO alembic.runtime.migration [-] Running upgrade ussuri_contract01 -> wallaby_contract01
2025-01-20 14:20:25.713 35181 INFO alembic.runtime.migration [-] Running upgrade wallaby_contract01 -> xena_contract01
2025-01-20 14:20:25.713 35181 INFO alembic.runtime.migration [-] Running upgrade xena_contract01 -> yoga_contract01
2025-01-20 14:20:25.714 35181 INFO alembic.runtime.migration [-] Running upgrade yoga_contract01 -> zed_contract01
2025-01-20 14:20:25.714 35181 INFO alembic.runtime.migration [-] Running upgrade zed_contract01 -> 2023_1_contract01
2025-01-20 14:20:25.715 35181 INFO alembic.runtime.migration [-] Running upgrade 2023_1_contract01 -> 2024_1_contract01
2025-01-20 14:20:25.717 35181 INFO alembic.runtime.migration [-] Context impl PostgresqlImpl.
2025-01-20 14:20:25.717 35181 INFO alembic.runtime.migration [-] Will assume transactional DDL.
Upgraded database to: 2024_1_contract01, current revision(s): 2024_1_contract01
2025-01-20 14:20:25.719 35181 INFO alembic.runtime.migration [-] Context impl PostgresqlImpl.
2025-01-20 14:20:25.719 35181 INFO alembic.runtime.migration [-] Will assume transactional DDL.
Database is synced successfully.

Create the policy definition to allow creating an image:

[root@controller ~]$ cat /etc/glance/policy.yaml 
{
    "default": "",
    "add_image": "role:admin",
    "modify_image": "role:admin",
    "delete_image": "role:admin"
}

Enable and start the service:

[root@controller ~]$ systemctl enable openstack-glance-api.service
[root@controller ~]$ systemctl start openstack-glance-api.service

Verify operation of the Image service using CirrOS, a small Linux image that helps you test your OpenStack deployment:

[root@controller ~]$ mkdir -p /var/cache/glance/api/
[root@controller ~]$ . admin-openrc
[root@controller ~]$ wget http://download.cirros-cloud.net/0.4.0/cirros-0.4.0-x86_64-disk.img
[root@controller ~]$ glance image-create --name "cirros" --file cirros-0.4.0-x86_64-disk.img   --disk-format qcow2 --container-format bare   --visibility=public
+------------------+----------------------------------------------------------------------------------+
| Property         | Value                                                                            |
+------------------+----------------------------------------------------------------------------------+
| checksum         | 443b7623e27ecf03dc9e01ee93f67afe                                                 |
| container_format | bare                                                                             |
| created_at       | 2025-01-20T14:15:21Z                                                             |
| disk_format      | qcow2                                                                            |
| id               | 150fd48b-8ed4-4170-ad98-213d9eddcba0                                             |
| min_disk         | 0                                                                                |
| min_ram          | 0                                                                                |
| name             | cirros                                                                           |
| os_hash_algo     | sha512                                                                           |
| os_hash_value    | 6513f21e44aa3da349f248188a44bc304a3653a04122d8fb4535423c8e1d14cd6a153f735bb0982e |
|                  | 2161b5b5186106570c17a9e58b64dd39390617cd5a350f78                                 |
| os_hidden        | False                                                                            |
| owner            | 920bf34a6c88454f90d405124ca1076d                                                 |
| protected        | False                                                                            |
| size             | 12716032                                                                         |
| status           | active                                                                           |
| stores           | fs                                                                               |
| tags             | []                                                                               |
| updated_at       | 2025-01-20T14:15:22Z                                                             |
| virtual_size     | 46137344                                                                         |
| visibility       | public                                                                           |
+------------------+----------------------------------------------------------------------------------+

[root@controller ~]$ glance image-list
+--------------------------------------+--------+
| ID                                   | Name   |
+--------------------------------------+--------+
| 150fd48b-8ed4-4170-ad98-213d9eddcba0 | cirros |
+--------------------------------------+--------+

Fine, the image is available and now we have this:

The next service we need to deploy is the Placement service. This service is used by other services to manage and allocate their resources. Like the Keystone and Glance services, this service needs a database backend:

[root@controller ~]$ su - postgres -c "psql -c \"create user placement with login password 'admin'\""
CREATE ROLE
[root@controller ~]$ su - postgres -c "psql -c 'create database placement with owner=placement'"
CREATE DATABASE
[root@controller ~]$ su - postgres -c "psql -l"
                                                        List of databases
   Name    |   Owner   | Encoding | Locale Provider |   Collate   |    Ctype    | ICU Locale | ICU Rules |   Access privileges   
-----------+-----------+----------+-----------------+-------------+-------------+------------+-----------+-----------------------
 glance    | glance    | UTF8     | libc            | en_US.UTF-8 | en_US.UTF-8 |            |           | 
 keystone  | keystone  | UTF8     | libc            | en_US.UTF-8 | en_US.UTF-8 |            |           | 
 placement | placement | UTF8     | libc            | en_US.UTF-8 | en_US.UTF-8 |            |           | 
 postgres  | postgres  | UTF8     | libc            | en_US.UTF-8 | en_US.UTF-8 |            |           | 
 template0 | postgres  | UTF8     | libc            | en_US.UTF-8 | en_US.UTF-8 |            |           | =c/postgres          +
           |           |          |                 |             |             |            |           | postgres=CTc/postgres
 template1 | postgres  | UTF8     | libc            | en_US.UTF-8 | en_US.UTF-8 |            |           | =c/postgres          +
           |           |          |                 |             |             |            |           | postgres=CTc/postgres
(6 rows)

In the same way as with the Glance service, the user, role, service and endpoints need to be created:

[root@controller ~]$ . admin-openrc
[root@controller ~]$ openstack user create --domain default --password-prompt placement
User Password:
Repeat User Password:
No password was supplied, authentication will fail when a user does not have a password.
+---------------------+----------------------------------+
| Field               | Value                            |
+---------------------+----------------------------------+
| default_project_id  | None                             |
| domain_id           | default                          |
| email               | None                             |
| enabled             | True                             |
| id                  | 9d1de7fda54a441b9c1289f8dc520e2b |
| name                | placement                        |
| description         | None                             |
| password_expires_at | None                             |
+---------------------+----------------------------------+
[root@controller ~]$ openstack role add --project service --user placement admin
[root@controller ~]$ openstack service create --name placement --description "Placement API" placement
+-------------+----------------------------------+
| Field       | Value                            |
+-------------+----------------------------------+
| id          | 43865e881fb84730b2bfc7c099c8038d |
| name        | placement                        |
| type        | placement                        |
| enabled     | True                             |
| description | Placement API                    |
+-------------+----------------------------------+

[root@controller ~]$ openstack endpoint create --region RegionOne placement public http://controller:8778
+--------------+----------------------------------+
| Field        | Value                            |
+--------------+----------------------------------+
| enabled      | True                             |
| id           | 5dac42d9532e4fd7b34a8e990ec5d408 |
| interface    | public                           |
| region       | RegionOne                        |
| region_id    | RegionOne                        |
| service_id   | 43865e881fb84730b2bfc7c099c8038d |
| service_name | placement                        |
| service_type | placement                        |
| url          | http://controller:8778           |
+--------------+----------------------------------+

[root@controller ~]$ openstack endpoint create --region RegionOne placement internal http://controller:8778
+--------------+----------------------------------+
| Field        | Value                            |
+--------------+----------------------------------+
| enabled      | True                             |
| id           | 535c652f6e654f56843076cf91a0fe84 |
| interface    | internal                         |
| region       | RegionOne                        |
| region_id    | RegionOne                        |
| service_id   | 43865e881fb84730b2bfc7c099c8038d |
| service_name | placement                        |
| service_type | placement                        |
| url          | http://controller:8778           |
+--------------+----------------------------------+

[root@controller ~]$ openstack endpoint create --region RegionOne placement admin http://controller:8778
+--------------+----------------------------------+
| Field        | Value                            |
+--------------+----------------------------------+
| enabled      | True                             |
| id           | 58370b01f8904e1f8aab8c718c8a4cb6 |
| interface    | admin                            |
| region       | RegionOne                        |
| region_id    | RegionOne                        |
| service_id   | 43865e881fb84730b2bfc7c099c8038d |
| service_name | placement                        |
| service_type | placement                        |
| url          | http://controller:8778           |
+--------------+----------------------------------+

Install the package which brings the Placement API and configure the service:

[root@controller ~]$ dnf install openstack-placement-api -y
[root@controller ~]$ egrep -v "^#|^$" /etc/placement/placement.conf
[DEFAULT]
[api]
auth_strategy = keystone
[cors]
[keystone_authtoken]
auth_url = http://controller:5000/v3
memcached_servers = controller:11211
auth_type = password
project_domain_name = Default
user_domain_name = Default
project_name = service
username = placement
password = admin
www_authenticate_uri  = http://controller:5000
[oslo_middleware]
[oslo_policy]
[placement]
[placement_database]
connection = postgresql+psycopg2://placement:admin@localhost/placement
[profiler]
[profiler_jaeger]
[profiler_otlp]

Populate the database (this does not produce any output if successful):

[root@controller ~]$ sh -c "placement-manage db sync" placement

… and restart the web server after adding this block:

<Directory /usr/bin>
   <IfVersion >= 2.4>
      Require all granted
   </IfVersion>
   <IfVersion < 2.4>
      Order allow,deny
      Allow from all
   </IfVersion>
</Directory>

… to the end of “/etc/httpd/conf.d/00-placement-api.conf”.

[root@controller ~]$ systemctl restart httpd

If everything went fine and was configured correctly, the service can be verified with:

[root@controller ~]$ placement-status upgrade check
+-------------------------------------------+
| Upgrade Check Results                     |
+-------------------------------------------+
| Check: Missing Root Provider IDs          |
| Result: Success                           |
| Details: None                             |
+-------------------------------------------+
| Check: Incomplete Consumers               |
| Result: Success                           |
| Details: None                             |
+-------------------------------------------+
| Check: Policy File JSON to YAML Migration |
| Result: Success                           |
| Details: None                             |

[root@controller ~]$ dnf install python3-osc-placement -y
[root@controller ~]$ openstack --os-placement-api-version 1.2 resource class list
+----------------------------------------+
| name                                   |
+----------------------------------------+
| VCPU                                   |
| MEMORY_MB                              |
| DISK_GB                                |
| PCI_DEVICE                             |
| SRIOV_NET_VF                           |
| NUMA_SOCKET                            |
| NUMA_CORE                              |
| NUMA_THREAD                            |
| NUMA_MEMORY_MB                         |
| IPV4_ADDRESS                           |
| VGPU                                   |
| VGPU_DISPLAY_HEAD                      |
| NET_BW_EGR_KILOBIT_PER_SEC             |
| NET_BW_IGR_KILOBIT_PER_SEC             |
| PCPU                                   |
| MEM_ENCRYPTION_CONTEXT                 |
| FPGA                                   |
| PGPU                                   |
| NET_PACKET_RATE_KILOPACKET_PER_SEC     |
| NET_PACKET_RATE_EGR_KILOPACKET_PER_SEC |
| NET_PACKET_RATE_IGR_KILOPACKET_PER_SEC |
+----------------------------------------+

Fine, now we have the Image and Placement services up and running and our setup looks like this:

In the next post we’ll continue with setting up the compute service (Nova).


Automate your Deployments in Azure with Terraform!

Thu, 2025-01-16 07:43

Terraform is a powerful open-source, declarative and platform-agnostic infrastructure as code (IaC) tool developed by HashiCorp. It facilitates the deployment and overall management of infrastructure. In this hands-on blog I will show you how you can use Terraform to automate your cloud deployments in Azure.

Initial Setup:

For this blog I’m using an Ubuntu server as an automation server where I’m running Terraform. You can install Terraform on different operating systems. For instructions on how to install Terraform, check out this link from HashiCorp.

To start with the hands-on part, I’m creating a new dedicated directory for my new Terraform project:

Within this new directory I’m creating the following files which will hold my configuration code:

  • main.tf
  • providers.tf
  • variables.tf
  • outputs.tf
Installing the Azure CLI:

For the authentication with Azure I’m using the Azure CLI command line tool. You can install the Azure CLI on Ubuntu with a single command which downloads a script provided by Microsoft and executes it on your system:

curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash

For more information about how to install the Azure CLI on your system, check out this link from Microsoft.

Once the installation has completed successfully, you can verify it with the following command:

az --version

Then use the following command for connecting to Azure:

az login

This command will open a browser window where you can sign in to Azure:

After you have successfully authenticated to Azure, you can check your available subscriptions with the following command:

az account list

Initialize Terraform:

As the Azure CLI is now installed on the system and we are successfully authenticated to Azure, we can start with the configuration of Terraform and the required provider for interacting with the Azure cloud platform.

Therefore I add the code block below to the providers.tf file, which tells Terraform to install and initialize the azurerm provider with the specific version 4.10.0, the latest at the moment:
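The original post shows this block as a screenshot; a minimal sketch of such a providers.tf block could look like this (only the provider source and version are taken from the text, the rest is standard boilerplate):

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "4.10.0"
    }
  }
}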

To configure the azurerm provider, I additionally add the provider code block below to the providers.tf file.
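A sketch of what this provider block could look like; the subscription ID below is just a placeholder, you would use your own:

provider "azurerm" {
  features {}

  # placeholder value: take your own subscription ID from "az account list"
  subscription_id = "00000000-0000-0000-0000-000000000000"
}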

You can find your subscription ID in the output from the “az account list” command above.

After inserting those code blocks into the providers.tf file, we can install the defined azurerm provider and initialize Terraform by running the command below in our project directory:

terraform init

Terraform Workspaces:

As Terraform is now successfully initialized and the required provider is installed, we can start with the development of our infrastructure code.

But before doing so I would like to touch on the concept of workspaces in Terraform. Workspaces enable you to use the same configuration code for multiple environments through separate state files. You can think of a workspace as a separate deployment environment and of your Terraform code as an independent plan or image of your infrastructure. As an example, imagine you added a new virtual machine to your Terraform code and deployed it in production. If you now want the same virtual machine for test purposes, you just have to switch into your test workspace and run the Terraform code again. You will have the exact same virtual machine within a few minutes!

To check the workspaces you have in your Terraform project, use the following command:

terraform workspace list

As you can see, we just have the default workspace in our new Terraform project. In this hands-on blog post I want to deploy my infrastructure for multiple environments, so I will create some new workspaces. Let’s assume we have a development, test and production stage for our infrastructure. I will therefore create the workspaces accordingly with the commands below:

terraform workspace new development

terraform workspace new test

terraform workspace new production

After executing these commands, we can now check again our available workspaces in our terraform project:

Note that Terraform indicates your current workspace with the “*” symbol next to it. We want to deploy our infrastructure for development first, so I will switch back into the development workspace with the following command:

terraform workspace select development

Create an Azure Resource Group:

As the workspaces are now successfully created, we can start with our configuration code.

First of all I go into the variables.tf file and add the variable code block below to that file:
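The variable itself was shown as an image in the original post; a sketch along these lines would match the naming used later (the mapping values are my assumption, only “DEV” for development is confirmed by the text):

variable "env" {
  description = "Short environment name per workspace, used as resource name suffix/prefix"
  type        = map(string)
  default = {
    development = "DEV"
    test        = "TST"
    production  = "PRD"
  }
}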

I will use this “env” variable as a suffix or prefix for the names of the resources I deploy, so that I can easily recognize which environment a resource belongs to.

Next I will create a resource group in Azure. Therefore I add the code block below to the main.tf file:
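A minimal sketch of such a resource group block (the resource label “rg” and the location are illustrative assumptions):

resource "azurerm_resource_group" "rg" {
  # "RG_" prefix plus the environment value of the current workspace, e.g. "RG_DEV"
  name     = "RG_${var.env[terraform.workspace]}"
  location = "westeurope"   # location is an assumption
}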

As you can see, I set the name of the resource group dynamically with the prefix “RG_” and the value for the current workspace from the variable “env”, which I defined before in the variables.tf file. “terraform.workspace” is a built-in value which refers to the current workspace.

To check which resources Terraform would create if we applied the current configuration code, we can run the following command:

terraform plan

We can see that Terraform would create a new resource group with the name “RG_DEV”.

Create a Virtual Network and Subnets:

Next I will create a virtual network. Therefore I add the variable code block below to the variables.tf file. This variable defines its own address space for each environment stage:
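A sketch of such a variable; the actual address ranges are my assumptions:

variable "cidr" {
  description = "Address space of the virtual network per workspace (example values)"
  type        = map(string)
  default = {
    development = "10.10.0.0/16"
    test        = "10.20.0.0/16"
    production  = "10.30.0.0/16"
  }
}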

I now add the code block below to the main.tf file to create the virtual network:
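A minimal sketch of such a virtual network block, reusing the resource group and variables assumed above:

resource "azurerm_virtual_network" "vnet" {
  name                = "VNET_${var.env[terraform.workspace]}"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  address_space       = [var.cidr[terraform.workspace]]
}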

As you can see, I’m referencing the “env” variable here as well to dynamically set the suffix of the network name, and the new “cidr” variable to set the address space of the virtual network.

Next I will create some subnets within the virtual network. I want to create 4 subnets in total:

  • A front tier subnet
  • A middle tier subnet
  • A backend tier subnet
  • A bastion subnet for the administration

Therefore I add the variable below to my variables.tf file, which defines an address space for each environment stage and subnet:
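A sketch of such a per-environment, per-subnet variable; the subnet keys and address ranges are my assumptions:

variable "subnet_cidr" {
  description = "Address space per workspace and subnet (example values)"
  type        = map(map(string))
  default = {
    development = {
      frontend = "10.10.1.0/24"
      midtier  = "10.10.2.0/24"
      backend  = "10.10.3.0/24"
      bastion  = "10.10.4.0/24"
    }
    # test and production would follow the same pattern within their own address space
  }
}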

Next I will add for each subnet a new resource block to the main.tf file:
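Two of the four subnet blocks as a sketch (the middle tier and bastion subnets look the same; the bastion one would need to be named “AzureBastionSubnet” for Azure Bastion, and the exact value accepted by private_endpoint_network_policies depends on the azurerm version):

resource "azurerm_subnet" "frontend" {
  name                 = "frontend-subnet"
  resource_group_name  = azurerm_resource_group.rg.name
  virtual_network_name = azurerm_virtual_network.vnet.name
  address_prefixes     = [var.subnet_cidr[terraform.workspace]["frontend"]]
}

resource "azurerm_subnet" "backend" {
  name                              = "backend-subnet"
  resource_group_name               = azurerm_resource_group.rg.name
  virtual_network_name              = azurerm_virtual_network.vnet.name
  address_prefixes                  = [var.subnet_cidr[terraform.workspace]["backend"]]
  # enforce NSGs on private endpoints in this subnet
  private_endpoint_network_policies = "Enabled"
}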

Note that I enabled the option “private_endpoint_network_policies” on the backend tier subnet. This option enforces network security groups on the private endpoints in that subnet. Check out this link from Microsoft for more information about this option.

Create an Azure SQL Database:

Next I will create an Azure SQL Server. Therefore I add the variable below to my variables.tf file. This variable is supposed to hold the admin password of the Azure SQL Server. I set the sensitive option for this variable, which prevents the password from being exposed in the terminal output or in the Terraform logs:
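A sketch of such a sensitive variable (no default value, as described in the text):

variable "sqlserver_password" {
  description = "Admin password for the Azure SQL Server"
  type        = string
  sensitive   = true
}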

I also did not set any value in the configuration files; instead, I will set the variable value as an environment variable before applying the configuration.

Next I add the code block below to my main.tf file:
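A sketch of such an Azure SQL Server block; the server name is an assumption (it must be globally unique), the rest follows the text:

resource "azurerm_mssql_server" "sqlserver" {
  # name is an illustrative assumption and must be globally unique
  name                          = "sqlserver-${lower(var.env[terraform.workspace])}-demo"
  resource_group_name           = azurerm_resource_group.rg.name
  location                      = azurerm_resource_group.rg.location
  version                       = "12.0"
  administrator_login           = "sqladmin"
  administrator_login_password  = var.sqlserver_password
  # no access over the public endpoint; a private endpoint is created later
  public_network_access_enabled = false
}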

As you can see, I referenced the “sqlserver_password” variable to set the password for the “sqladmin” user. I also disabled public network access to prevent database access over the public endpoint of the server. I will instead create a private endpoint later on.

Next I will create the Azure SQL Database. Therefore I add the variable below to my variables.tf file:
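A sketch of what such a settings variable could look like; the SKU names and attribute names are my assumptions, the per-stage values follow the explanation below:

variable "database_settings" {
  description = "Per-workspace database settings: SKU, PITR retention and backup storage redundancy"
  type = map(object({
    sku_name             = string
    retention_days       = number
    storage_account_type = string
  }))
  default = {
    development = { sku_name = "GP_Gen5_2", retention_days = 7,  storage_account_type = "Zone" }
    test        = { sku_name = "GP_Gen5_2", retention_days = 7,  storage_account_type = "Zone" }
    production  = { sku_name = "BC_Gen5_2", retention_days = 30, storage_account_type = "GeoZone" }
  }
}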

The idea behind this variable is that we have different requirements for the different stages. The General Purpose SKU is sufficient for the non-production databases, but for the production one we want the Business Critical service tier. We also want 30 days of point-in-time recovery for our production data, while 7 days is sufficient for non-production, and we want to store our production database backups on geo-zone-redundant storage, while zone-redundant storage is sufficient for the non-production databases.

Then I add the resource block below into my main.tf file:
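A sketch of such a database block, driven by the settings variable assumed above (the database name is illustrative):

resource "azurerm_mssql_database" "db" {
  name                 = "appdb-${lower(var.env[terraform.workspace])}"   # name is an assumption
  server_id            = azurerm_mssql_server.sqlserver.id
  sku_name             = var.database_settings[terraform.workspace].sku_name
  storage_account_type = var.database_settings[terraform.workspace].storage_account_type

  # point-in-time recovery retention per stage
  short_term_retention_policy {
    retention_days = var.database_settings[terraform.workspace].retention_days
  }
}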

As you can see, I’m referencing my “database_settings” variable to set the configuration options dynamically.

Create a DNS Zone and a Private Endpoint:

For name resolution I will next create a private DNS zone. For that I add the resource block below to my main.tf file:
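A minimal sketch of such a private DNS zone for Azure SQL:

resource "azurerm_private_dns_zone" "sql" {
  name                = "privatelink.database.windows.net"
  resource_group_name = azurerm_resource_group.rg.name
}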

To associate this private DNS zone now with my virtual network, I will next create a virtual network link. Therefore I add the resource block below to my main.tf file:
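A sketch of such a virtual network link (the link name is an assumption):

resource "azurerm_private_dns_zone_virtual_network_link" "sql" {
  name                  = "sql-dns-link"
  resource_group_name   = azurerm_resource_group.rg.name
  private_dns_zone_name = azurerm_private_dns_zone.sql.name
  virtual_network_id    = azurerm_virtual_network.vnet.id
}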

To be able to securely connect to my Azure SQL database, I will now create a private endpoint in my backend subnet. Therefore I add the resource block below to my main.tf file:
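A sketch of such a private endpoint, wired to the backend subnet and the DNS zone assumed above:

resource "azurerm_private_endpoint" "sql" {
  name                = "${azurerm_mssql_server.sqlserver.name}-endpoint"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  subnet_id           = azurerm_subnet.backend.id

  private_service_connection {
    name                           = "${azurerm_mssql_server.sqlserver.name}-connection"
    private_connection_resource_id = azurerm_mssql_server.sqlserver.id
    subresource_names              = ["sqlServer"]
    is_manual_connection           = false
  }

  private_dns_zone_group {
    name                 = "sql-dns-zone-group"
    private_dns_zone_ids = [azurerm_private_dns_zone.sql.id]
  }
}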

With this configuration code, I create a private endpoint named after the Azure SQL Server with the suffix “-endpoint”. Through the option “subnet_id” I place this endpoint in the backend subnet, with a private service connection to the Azure SQL Server. I also associate the endpoint with the private DNS zone I created just before, for name resolution.

Create an Azure Bastion:

Let’s now continue and create an Azure Bastion host for the administration of our environment. For that, I first create a public IP address by adding the resource block below to my main.tf file:
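A sketch of such a public IP (the resource name is an assumption; Azure Bastion requires a Standard static IP):

resource "azurerm_public_ip" "bastion" {
  name                = "bastion-pip-${lower(var.env[terraform.workspace])}"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  allocation_method   = "Static"
  sku                 = "Standard"
}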

Next I create the bastion host itself. For that I add the code block below to my main.tf file:
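A sketch of such a bastion host; it assumes an azurerm_subnet.bastion resource exists (created like the subnets above and named “AzureBastionSubnet”):

resource "azurerm_bastion_host" "bastion" {
  name                = "bastion-${lower(var.env[terraform.workspace])}"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name

  ip_configuration {
    name                 = "bastion-ipcfg"
    # must point to the dedicated "AzureBastionSubnet"
    subnet_id            = azurerm_subnet.bastion.id
    public_ip_address_id = azurerm_public_ip.bastion.id
  }
}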

Create a Virtual Machine:

Now I will add a virtual machine to my middle tier subnet. Therefore I first need to create a network interface for that virtual machine. The resource block below will create the needed network interface:
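A sketch of such a network interface; it assumes an azurerm_subnet.midtier resource defined like the subnets above:

resource "azurerm_network_interface" "vm" {
  name                = "vm-nic-${lower(var.env[terraform.workspace])}"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name

  ip_configuration {
    name                          = "internal"
    subnet_id                     = azurerm_subnet.midtier.id
    private_ip_address_allocation = "Dynamic"
  }
}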

As the virtual machine I intend to create needs an admin password just like the Azure SQL Server, I will create an additional password variable. Therefore I add the code block below to my variables.tf file:
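A sketch of this second sensitive variable, analogous to the SQL Server one:

variable "vm_password" {
  description = "Admin password for the virtual machine"
  type        = string
  sensitive   = true
}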

To create the virtual machine itself, I add the resource block below to my main.tf file:
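A sketch of such a virtual machine block; since the post later connects via RDP and installs SQL Server Management Studio, a Windows VM is assumed, and the size, admin user name and image are illustrative assumptions:

resource "azurerm_windows_virtual_machine" "vm" {
  name                  = "vm-${lower(var.env[terraform.workspace])}"
  resource_group_name   = azurerm_resource_group.rg.name
  location              = azurerm_resource_group.rg.location
  size                  = "Standard_B2ms"   # size is an assumption
  admin_username        = "vmadmin"         # user name is an assumption
  admin_password        = var.vm_password
  network_interface_ids = [azurerm_network_interface.vm.id]

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "StandardSSD_LRS"
  }

  source_image_reference {
    publisher = "MicrosoftWindowsServer"
    offer     = "WindowsServer"
    sku       = "2022-datacenter-azure-edition"
    version   = "latest"
  }
}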

Create Network Security Groups and Rules:

Next I want to secure my subnets. Therefore I create a network security group for each of my front tier, middle tier and backend tier subnets by adding the resource blocks below to my main.tf file:
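One of the three network security groups as a sketch, together with its subnet association (the middle tier and backend NSGs follow the same pattern, using the labels midtier and backend assumed in the later rule examples):

resource "azurerm_network_security_group" "frontend" {
  name                = "nsg-frontend-${lower(var.env[terraform.workspace])}"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
}

resource "azurerm_subnet_network_security_group_association" "frontend" {
  subnet_id                 = azurerm_subnet.frontend.id
  network_security_group_id = azurerm_network_security_group.frontend.id
}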

Next I create the specific rules for each network security group.

Starting with the front tier subnet, I want to block all inbound traffic except traffic over HTTPS. Therefore I add the two resource blocks below to my main.tf file:
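A sketch of the two rules, with priorities and rule names as assumptions:

resource "azurerm_network_security_rule" "front_allow_https" {
  name                        = "allow-https-inbound"
  priority                    = 100
  direction                   = "Inbound"
  access                      = "Allow"
  protocol                    = "Tcp"
  source_port_range           = "*"
  destination_port_range      = "443"
  source_address_prefix       = "*"
  destination_address_prefix  = "*"
  resource_group_name         = azurerm_resource_group.rg.name
  network_security_group_name = azurerm_network_security_group.frontend.name
}

resource "azurerm_network_security_rule" "front_deny_all" {
  name                        = "deny-all-inbound"
  priority                    = 4096
  direction                   = "Inbound"
  access                      = "Deny"
  protocol                    = "*"
  source_port_range           = "*"
  destination_port_range      = "*"
  source_address_prefix       = "*"
  destination_address_prefix  = "*"
  resource_group_name         = azurerm_resource_group.rg.name
  network_security_group_name = azurerm_network_security_group.frontend.name
}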

Continuing with the middle tier subnet, I want to block all inbound traffic but allow HTTP traffic only from the front tier subnet and RDP traffic only from the bastion subnet. Therefore I add the three resource blocks below to my main.tf file:
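Two of the three rules as a sketch (the third is a low-priority deny-all-inbound rule like the one on the front tier NSG); an azurerm_network_security_group.midtier resource is assumed:

resource "azurerm_network_security_rule" "mid_allow_http_from_front" {
  name                        = "allow-http-from-frontend"
  priority                    = 100
  direction                   = "Inbound"
  access                      = "Allow"
  protocol                    = "Tcp"
  source_port_range           = "*"
  destination_port_range      = "80"
  source_address_prefix       = var.subnet_cidr[terraform.workspace]["frontend"]
  destination_address_prefix  = "*"
  resource_group_name         = azurerm_resource_group.rg.name
  network_security_group_name = azurerm_network_security_group.midtier.name
}

resource "azurerm_network_security_rule" "mid_allow_rdp_from_bastion" {
  name                        = "allow-rdp-from-bastion"
  priority                    = 110
  direction                   = "Inbound"
  access                      = "Allow"
  protocol                    = "Tcp"
  source_port_range           = "*"
  destination_port_range      = "3389"
  source_address_prefix       = var.subnet_cidr[terraform.workspace]["bastion"]
  destination_address_prefix  = "*"
  resource_group_name         = azurerm_resource_group.rg.name
  network_security_group_name = azurerm_network_security_group.midtier.name
}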

Last but not least, I want to block all inbound traffic to my backend tier subnet except traffic to the SQL Server port from the middle tier subnet. In addition, I want to explicitly block internet access from this subnet.

You may be wondering why I’m explicitly blocking internet access from this subnet although it has no public IP address or NAT gateway. That’s because Microsoft provides access to the internet through a default outbound IP address if no explicit way is defined. This feature will be deprecated on September 30th, 2025. For more information about this feature, check out this link from Microsoft.

To create the rules for the backend tier subnet I add the three resource blocks below to my main.tf file:
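Two of the three rules as a sketch (again, a deny-all-inbound rule like on the other NSGs completes the set); an azurerm_network_security_group.backend resource is assumed:

resource "azurerm_network_security_rule" "back_allow_sql_from_mid" {
  name                        = "allow-sql-from-midtier"
  priority                    = 100
  direction                   = "Inbound"
  access                      = "Allow"
  protocol                    = "Tcp"
  source_port_range           = "*"
  destination_port_range      = "1433"
  source_address_prefix       = var.subnet_cidr[terraform.workspace]["midtier"]
  destination_address_prefix  = "*"
  resource_group_name         = azurerm_resource_group.rg.name
  network_security_group_name = azurerm_network_security_group.backend.name
}

resource "azurerm_network_security_rule" "back_deny_internet_outbound" {
  name                        = "deny-internet-outbound"
  priority                    = 4000
  direction                   = "Outbound"
  access                      = "Deny"
  protocol                    = "*"
  source_port_range           = "*"
  destination_port_range      = "*"
  source_address_prefix       = "*"
  destination_address_prefix  = "Internet"
  resource_group_name         = azurerm_resource_group.rg.name
  network_security_group_name = azurerm_network_security_group.backend.name
}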

Define Output Variables:

I will stop here with the creation of resources for this blog post and finally show you how you can define outputs. For example, let’s assume we want to extract the name of the Azure SQL Server and the IP address of the virtual machine after the deployment. Therefore I add the two output variables below to the outputs.tf file:
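A sketch of these two outputs, based on the resources assumed in the earlier sketches:

output "sqlserver_name" {
  value = azurerm_mssql_server.sqlserver.name
}

output "vm_private_ip" {
  value = azurerm_network_interface.vm.private_ip_address
}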

Outputs are especially useful when you need to pass information from the deployment up to a higher context, for example when you are working with modules in Terraform and you want to pass information from a child module to a parent module. In our case the outputs will just be printed to the command line after the deployment.

Apply the Configuration Code:

As I am now done with the definition of the configuration code for this blog post, I will plan and apply my configuration for each stage. Before doing so, I first need to set a value for my password variables. On Ubuntu this can be done with these commands:

export TF_VAR_sqlserver_password="your password"

export TF_VAR_vm_password="your password"

After I’ve set the variables, I run the terraform plan command and we can see that Terraform would create 29 resources:

This looks good to me, so I run the terraform apply command to deploy my infrastructure:

terraform apply

After some minutes of patience, Terraform applied the configuration code successfully:

Check the Infrastructure in the Azure Portal:

When I sign in to the Azure portal, I can see my development resource group with all the resources inside:

I want to have my test and production resources as well, so I switch the Terraform workspace to test and then to production and run the terraform apply command again in both workspaces.

After some additional minutes of patience, we can see in the Azure portal that we now have all resources for each environment stage:

Let’s now compare the settings of the production database with the development database: the SKU of the production one is Business Critical, while the SKU of the non-production one is General Purpose:

The backup storage redundancy has also been set according to the “database_settings” variable:

We can see the same for the point in time recovery option:

We can see that our subnets are also all in place with the corresponding address space and network security group associated:

Let’s check the private endpoint of the Azure SQL Server. We can see that we have a private IP address within our backend subnet which is linked to the Azure SQL Server:

Let’s connect to the virtual machine and try a name resolution. You can see that we were able to successfully resolve the FQDN:

After installing SQL Server Management Studio on the virtual machine, we can also connect to the Azure SQL Server through the FQDN of the private endpoint:

Delete the Deployed Resources:

To avoid a high bill for something I don’t use, I will now delete all the resources which I’ve created with Terraform. This is very simple and can be done by running the terraform destroy command in each workspace:

terraform destroy

I hope this gave you some interesting examples and ideas about Terraform and Azure! Feel free to share your questions and thoughts about Terraform and Azure with me in the comment section below.


Creating your private cloud using OpenStack – (1) – Introduction

Thu, 2025-01-16 04:54

While public clouds have been a trend for several years now, some companies are also looking into self-hosted solutions to build a private cloud. Some do this because of costs, others because they don’t want to be dependent on one or multiple public cloud providers, and others because they want to keep their data locally. There are several solutions for this and, depending on the requirements, they might or might not be an option. Some of the more popular ones are:

The other major player, which is not in the list above, is OpenStack, which started back in 2010. OpenStack is not a single product, but rather a set of products combined to provide a computing platform to deploy your workloads on top of either virtual machines, or containers, or a mix of both. There are plenty of sub-projects which bring in additional functionality, check here for a list. The project itself is hosted and supported by the OpenInfra Foundation, which should give sufficient trust that it will stay a pure open source project (maybe have a look at the OpenInfra supporting organizations as well, to get an idea of how widely it is adopted and supported).

The main issue with OpenStack is that it is kind of hard to get started with. There are so many services you might want to use that you’ll probably get overwhelmed at the beginning of your journey. To help you a bit with this, we’ll create a minimal, quick and dirty OpenStack setup on virtual machines with just the core services:

We’ll do that step by step, because we believe that you should know the components which finally make up the OpenStack platform, or any other stack you’re planning to deploy. There is also DevStack, which is a set of scripts for the same purpose, but as it is scripted you’ll probably not gain the same knowledge as by doing it manually. There is OpenStack-Helm in addition, which deploys OpenStack on top of Kubernetes, but this as well is out of scope for this series of blog posts. Canonical offers MicroStack, which can also be used to set up a test environment quickly.

Automation is great and necessary, but it also comes with a potential downside: The more you automate, the more people you’ll potentially have who don’t know what is happening in the background. This is usually fine as long as the people with the background knowledge stay in the company, but if they leave you might have an issue.

As there are quite some steps to follow, this will not be a single blog post, but will be split into several parts:

  • Introduction (this blog post)
  • Preparing the controller and the compute node
  • Setting up and configuring Keystone
  • Setting up and configuring Glance and the Placement service
  • Setting up and configuring Nova
  • Setting up and configuring Neutron
  • Setting up and configuring Horizon, the Openstack dashboard

In the simplest configuration, the OpenStack platform consists of two nodes: a controller node, and at least one compute node. Both of them will require two network interfaces, one for the so-called management network (as the name implies, this is for the management of the stack and communication with the internet), and the other one for the so-called provider network (this is the internal network the virtualized machines will be using to communicate with each other, for example).

When it comes to the choice of the Linux distribution you want to deploy OpenStack on, this is merely a matter of taste. OpenStack can be deployed on many distributions; the official documentation comes with instructions for Red Hat based distributions (which usually includes Alma Linux, Rocky Linux and Oracle Linux), SUSE based distributions (which includes openSUSE Leap), and Ubuntu (which should also work on Debian). For the scope of this blog series we’ll go with a minimal installation of Rocky Linux 9, just because I haven’t used it for quite some time.

OpenStack itself is released on a six-month release cycle and we’ll go with 2024.2 (Dalmatian), which will be supported until the beginning of 2026. As always, you should definitely go with the latest supported release so you have the most time to test and plan for future upgrades.

To give you an idea of what we’re going to start with, here is a graphical overview:

Of course this is very simplified, but it is enough to know for the beginning:

  • We have two nodes, one controller node and one compute node.
  • Both nodes have two network interfaces. The first one is configured using a 192.168.122.0/24 subnet and connected to the internet. The second one is not configured.
  • Both nodes are installed with a Rocky Linux 9 (9.5 as of today) minimal installation

We’ll add all the bits and pieces to this graphic while we install and configure the complete stack, don’t worry.

That’s it for the introduction. In the next post we’ll prepare the two nodes so we can continue to install and configure the OpenStack services on top of them.


Introducing YaK 2.0: The future of effortless PaaS deployments across Clouds and On-Premises

Wed, 2025-01-15 01:13
YaK 2.0 Automated multi-cloud PaaS deployment

Hello, dear tech enthusiasts and cloud aficionados!

We’ve got some news that’s about to make your life —or your deployments, at least— a whole lot easier. Meet YaK 2.0, the latest game-changer in the world of automated multi-cloud PaaS deployment. After months of development, testing, troubleshooting, a fair share of meetings and way too much coffee, YaK is officially launching today.

Automated multi-cloud PaaS deployment. What’s the deal with YaK 2.0?

Because we believe IT professionals should spend time on complex, value-added tasks, but not on repetitive setups, we have decided to develop the YaK.
YaK is a framework that allows anyone to deploy any type of component on any platform, while ensuring quality, cost efficiency and reducing deployment time.

YaK 2.0 is your new best friend when it comes to deploying infrastructure that’s not just efficient but also identical across every platform you’re working with – be it multi-cloud or on-premises. Originating from the need to deploy multi-technology infrastructures quickly and effortlessly, YaK ensures your setup is consistent, whether you’re working with AWS, Azure, Oracle Cloud, or your own on-prem servers.

In simpler terms, YaK makes sure your deployment process is consistent and reliable, no matter where. Whether you’re scaling in the cloud or handling things in-house, YaK’s got your back.

Why should you have a look at YaK 2.0?

Here’s why we think YaK is going to become your favorite pet:

  • Flexibility: Deploy across AWS, Azure, OCI, or your own servers—YaK adapts to your infrastructure, making every platform feel like home.
  • Automation: Eliminate repetitive setups with automated deployments, saving you time and headaches.
  • Cost efficiency & speed: YaK cuts time-to-market, streamlining deployments for fast, standardized rollouts that are both cost-effective and secure.
  • Freedom from vendor lock-in: YaK is vendor-neutral, letting you deploy on your terms, across any environment.
  • Swiss software backed by a consulting company (dbi services) with extensive expertise in deployments.

With this release, we’re excited to announce a major upgrade:

  • Sleek new user interface: YaK 2.0 now comes with a user-friendly interface, making it easier than ever to manage your deployments. Say hello to intuitive navigation.
YaK 2.0 Automated multi-cloud PaaS deployment
  • Components: We’ve got components on our roadmap (available with an annual subscription), and we’ll be announcing them shortly: Oracle Database, PostgreSQL, MongoDB, and Kubernetes are already on the list and will be released soon.

Many more will follow… Stay tuned!

How does it work?

YaK Core is the open-source part and is the heart of our product, featuring Ansible playbooks and a custom plugin that provides a single inventory for all platforms, making your server deployments seamless across clouds like AWS, Azure, and OCI.
If you want to see for yourself, our GitLab project is available here!

YaK 2.0 Automated multi-cloud PaaS deployment

YaK Components are the value-added part of the product and bring you expert-designed modules for deploying databases and application servers, with an annual subscription to dbi services.

Join the YaK pack

Explore the power of automated multi-cloud PaaS deployment with YaK 2.0 and experience a new level of efficiency and flexibility. We can’t wait for you to try it out and see just how much it can streamline your deployment process. Whether you’re a startup with big dreams or an established enterprise looking to optimize, YaK is here to make your life easier.

Our YaK deserved its own web page, check it out for more information, to contact us or to try it out (free demo environments will be available soon): yak4all.io

Wanna ride the YaK? Check out our user documentation to get started!
We promise it’ll be the smoothest ride you’ve had in a while.

We’re not just launching a product; we’re building a community. We’d love for you to chime in, share your experiences, and help us make YaK even better. Follow us on LinkedIn, join our community on GitLab, and let’s create something amazing together.

Feel free to reach out to us for more details or for a live presentation: info@dbi-services.com

Thanks for being part of this exciting journey. We can’t wait to see what you build with YaK.

The YaK Team

P.S. If you’re wondering about the name, well, yaks are known for being hardy, reliable, and able to thrive in any environment. Plus, they look pretty cool, don’t you think?


New Oracle Database Appliance X11 series for 2025

Tue, 2025-01-14 15:22
Introduction

Oracle Database Appliance X10 is not so old, but X11 is already out, available to order.

Let’s find out what’s new for this 2025 series.

What is an Oracle Database Appliance?

ODA, or Oracle Database Appliance, is an engineered system from Oracle. Basically, it’s an x86-64 server with a dedicated software distribution including Linux, Oracle Grid Infrastructure (GI) including Automatic Storage Management and Real Application Cluster, Oracle database software, a Command Line Interface (CLI), a Browser User Interface (BUI) and a virtualization layer. The goal being to simplify database lifecycle and maximize performance. Market position is somewhere between OCI (the Oracle public Cloud) and Exadata (the highest level engineered system – a kind of big and rather expensive ODA). For most clients, ODA brings both simplification and performance they just need. For me, ODA has always been one of my favorite solutions, and undoubtedly a solution to consider. X11 doesn’t change the rules regarding my recommendations.

To address a large range of clients, ODA is available in 3 models: S, L and HA.

For Enterprise Edition (EE) users, as well as for Standard Edition 2 (SE2) users, ODA has a strong advantage over its competitors: capacity on demand licensing. With EE you can start with 1x EE processor license (2 enabled cores). With SE2 you can start with 1x SE2 processor license (8 enabled cores). You can later scale up by enabling additional cores according to your needs.

On the processor side

X11 still relies on the AMD Epyc series for its processors, in line with Oracle’s recent long-term commitment to AMD.

Is the X11 CPU better than the X10 one? According to the data sheets, ODA moves from the Epyc 9334 to the Epyc 9J15. This latest version may be specific to Oracle as it doesn’t appear on the AMD website. Looking at the speed, the Epyc 9334 is clocked from 2.7GHz to 3.9GHz, and the Epyc 9J15 is clocked from 2.95GHz to 4.4GHz. As a consequence, you should probably expect a 10% performance increase per core. Not a huge bump, but X10 was quite a big improvement over the X9-2 Xeon processors. Each processor has 32 cores, and there is still 1 processor on the X11-S and 2 on the X11-L. As the X11-HA is basically two X11-L without local disks but connected to a disk enclosure, each node also has 2 Epyc processors.

Having a better CPU means better performance, but also fewer processor licenses needed for the same workload. It’s always something to keep in mind.

RAM and disks: same configuration as outgoing X10

Nothing new about RAM on X11: the same configurations are available, starting from 256GB on the X11-S and from 512GB on the X11-L and on each node of the X11-HA. You can double or triple the RAM size on each server if needed.

On X11-S and L models, data disks have the same size as X10 series: 6.8TB NVMe disks. X11-S has the same limitation as X10-S, only 2 disks and no possible expansion.

X11-L also comes with 2 disks, but you can add pairs of disks up to 8 disks, meaning 54TB of RAW storage. Be aware that only 4 disk slots are available on the front panel. Therefore, starting from the third pair of disks, disks are different: they are Add-In-Cards (AIC). It means that you will need to open your server to add or replace these disks, with a downtime for your databases.

The X11-HA is no different from the X10-HA: there is still a High Performance (HP) version and a High Capacity (HC) version, the first one composed only of SSDs, the second one composed of a mix of SSDs and HDDs. SSDs are 7.68TB each, and HDDs are 22TB each.

Network interfaces

Nothing new regarding network interfaces. You can have up to 3 of them (2 are optional), and for each you will choose between a quad-port 10GBase-T (copper) and a two-port 10/25GbE (SFP28). You should know that SFP28 won't connect to a 1Gbps fiber network. But using SFPs for a network limited to 1Gbps would not make sense anyway.

Software bundle

The latest software bundle for ODA is 19.25, so this is the one you will use on X11. This software bundle is also compatible with the X10, X9-2, X8-2 and X7-2 series. The bundle is the same for SE2 and EE editions.

What are the differences between the 3 models?

The X11-S is an entry level model for a small number of small databases.

The X11-L is much more capable and can take disk expansions. A big infrastructure with hundreds of databases can easily fit on several X11-Ls.

The X11-HA is for RAC users because High Availability is included. The disk capacity is much higher than single node models, and HDDs are still an option. With X11-HA, big infrastructures can be consolidated with a very small number of HA ODAs.

Model           | DB Edition | Nodes | U    | RAM      | RAM max   | RAW TB | RAW TB max | Base price
ODA X11-S       | EE and SE2 | 1     | 2    | 256GB    | 768GB     | 13.6   | 13.6       | 24'816$
ODA X11-L       | EE and SE2 | 1     | 2    | 512GB    | 1536GB    | 13.6   | 54.4       | 40'241$
ODA X11-HA (HP) | EE and SE2 | 2     | 8/12 | 2x 512GB | 2x 1536GB | 46     | 368        | 112'381$
ODA X11-HA (HC) | EE and SE2 | 2     | 8/12 | 2x 512GB | 2x 1536GB | 390    | 792        | 112'381$

You can run SE2 on the X11-HA, but it's much more of an appliance dedicated to EE clients.
I'm not so sure that the X11-HA still makes sense today compared to Exadata Cloud@Customer: study both options carefully if you need this kind of platform.

Is X11 more expensive than X10?

In the latest engineered systems price list (search for "exadata price list" and you will easily find it), you will see the X11 series alongside the X10 series. Prices are the same, so there is no reason to order the old ones.

Which one should you choose?

If your databases can comfortably fit on the storage of the S model, don’t hesitate as you will probably never need more.

The most interesting model is still the new X11-L. The L is quite affordable, has a great storage capacity, and is upgradable if you don't buy the full system at first.

If you still want/need RAC and its associated complexity, the HA may be for you but take a look at Exadata Cloud@Customer and compare the costs.

Don't forget that you will need at least 2 ODAs for Disaster Recovery purposes, using Data Guard (EE) or Dbvisit Standby (SE2). No one would recommend buying a single ODA. Mixing S and L is OK, but I would not recommend mixing L and HA ODAs, just because some operations are handled differently when using RAC.

I would still prefer buying 2x ODA X11-L over 1x ODA X11-HA. NVMe speed, no RAC and the simplicity of a single server are definitely better in my opinion. Extreme consolidation is not always the best solution.

Conclusion

The ODA X11 series is a slight refresh of the X10 series, but if you were previously using older generations (for example the X7-2, which reaches end of life this year), switching to X11 will make a significant difference. In 2025, ODA is still a good platform for database simplification and consolidation. And it's still very popular among our clients.

Useful links

X11-S/L datasheet

X11-HA datasheet

SE2 licensing rules on ODA X10 (apply to X11)

Storage and ASM on ODA X10-L (apply to X11)

L’article New Oracle Database Appliance X11 series for 2025 est apparu en premier sur dbi Blog.

PostgreSQL 18: Change the maximum number of autovacuum workers on the fly

Wed, 2025-01-08 05:58

For PostgreSQL it is critical that autovacuum is able to keep up with the changes in the instance. One of the parameters you can adapt for this is autovacuum_max_workers. This parameter controls how many worker processes can be started in parallel by the autovacuum launcher process. By default, this is set to 3, which means that a maximum of three worker processes can run in parallel to do the work. While you can increase this parameter easily, it requires a restart of the instance to become active. Starting with PostgreSQL 18 (scheduled to be released later this year), you'll be able to change this on the fly.

The default configuration for PostgreSQL 18 is still three worker processes:

postgres=# select version();
                                version                                 
------------------------------------------------------------------------
 PostgreSQL 18devel on x86_64-freebsd, compiled by clang-18.1.6, 64-bit
(1 row)

postgres=# show autovacuum_max_workers ;
 autovacuum_max_workers 
------------------------
 3
(1 row)

But now, you can increase that on the fly without restarting the instance (sighup means reload):

postgres=# select context from pg_settings where name = 'autovacuum_max_workers';
 context 
---------
 sighup
(1 row)

postgres=# 

On a PostgreSQL 17 (and before) instance, it looks like this (postmaster means restart):

postgres=# select version();
                                                                                           version                                                                                            
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 17.2 on x86_64-unknown-freebsd14.2, compiled by FreeBSD clang version 18.1.6 , 64-bit
(1 row)

postgres=# select context from pg_settings where name = 'autovacuum_max_workers';
  context   
------------
 postmaster
(1 row)

So, let's assume we have a PostgreSQL 18 instance where a lot of stuff is going on and we want to increase the number of worker processes to 32:

postgres=# alter system set autovacuum_max_workers = 32;
ALTER SYSTEM
postgres=# select pg_reload_conf();
 pg_reload_conf 
----------------
 t
(1 row)

postgres=# show autovacuum_max_workers ;
 autovacuum_max_workers 
------------------------
 32
(1 row)

That seems to have worked, but if you take a look at the PostgreSQL log file you’ll notice this:

2025-01-08 11:48:59.504 CET - 1 - 4411 - [local] - postgres@postgres - 0LOG:  statement: alter system set autovacuum_max_workers = 32;
2025-01-08 11:49:05.210 CET - 8 - 4174 -  - @ - 0LOG:  received SIGHUP, reloading configuration files
2025-01-08 11:49:05.212 CET - 9 - 4174 -  - @ - 0LOG:  parameter "autovacuum_max_workers" changed to "32"
2025-01-08 11:49:05.221 CET - 1 - 4180 -  - @ - 0WARNING:  "autovacuum_max_workers" (32) should be less than or equal to "autovacuum_worker_slots" (16)

This means there still is a limit, which is defined by "autovacuum_worker_slots", and the default for this one is 16. For most instances this is probably fine: you can go from the default (3) anywhere up to 16 without restarting the instance. If you think you'll need more from time to time, then you should also increase autovacuum_worker_slots, but this does require a restart of the instance.
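As a minimal sketch of how that could look from the shell (assuming a local instance, a superuser connection, and psql/pg_ctl in the PATH with PGDATA set), raising the slot limit and then adjusting the number of workers might be done like this:

# autovacuum_worker_slots needs a restart to change, so check its context first
psql -c "select name, setting, context from pg_settings where name = 'autovacuum_worker_slots';"
psql -c "alter system set autovacuum_worker_slots = 32;"
pg_ctl -D "$PGDATA" restart

# from now on, autovacuum_max_workers can be raised up to 32 with a simple reload
psql -c "alter system set autovacuum_max_workers = 32;"
psql -c "select pg_reload_conf();"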

Here the link to the commit.

L’article PostgreSQL 18: Change the maximum number of autovacuum workers on the fly est apparu en premier sur dbi Blog.

M-Files Ment: no code document automation solution in practice

Mon, 2025-01-06 09:49

Who has never experienced repetitive tasks when writing documents based on similar layouts and content, while making sure they are compliant? Don't you think this activity uses a lot of your precious knowledge worker time and often includes small errors or typos?

Let's see how the M-Files Ment solution will help you to achieve this goal, assuming your environment has been properly installed and set up (take a look here).

1. “Hiring contract” use case

In order to edit, send, sign and register a brand-new hiring contract for a future new employee/recruit, the C-Management and HR Administration teams have to create such a document and fill in several pieces of information, depending on the candidate's residence, subsidiary and so on.

Prior to entering this procedure, the applicant will have already completed all prerequisite job interviews and sent his CV to the company. This part is out of the scope of this blog, but it could of course be managed with a dedicated M-Files workflow.

Hence, as soon as the C-Management team members agree to pursue with the chosen candidate, they inform the responsible HR Administration team member to create a new hiring contract with all the needed applicant and job information.

While drafting the Word document, the responsible HR Administration team member will fill in the information below in the applicant's mother tongue, say Swiss French or Swiss German for simplicity:

  • Applicant Name
  • Applicant Address
  • Applicant Email Address
  • Job title category (such as Back office, Consultant or SD-Consultant)
  • Subsidiaries incorporation
2. Welcome to M-Files Ment

As soon as you reach your M-Files Ment company welcome page, depending on your user profile, you can choose between different menus. At most, with "Admin" user rights you will get access to all the functionalities below:

Additionally, an extra “Videos” menu is provided so you can watch quick short videos explaining how to handle the different software features.

3. Automate template
3.1 New template settings

As you can guess, let's start at the beginning and automate a new template.

Fill in the different fields. Some are mandatory, such as "Template name" (clear enough), "Document styles" (see section 3.2 for details) and "Available for".

Let’s set “Template Name” field to “Hiring Contract – English Question(s)” in our example.
It is possible to categorize your template using specific “Category” keyword(s) so you can filter based on these categories and retrieve the right template when you want to generate new document(s). Similarly, you can also tag your template to make your search in archives easier.

The "Available for" field is used to define which Ment user profiles (Admin, Author, Manager or User) can use the template:

Below is a short description helping to understand what this template relates to.

You may have observed that you can "Use M-Files vault as the datasource" to rely on its content to achieve your purpose. We are using the "dbi – Ment" vault, where several object types, classes and properties were created.

3.2 Document styles

As soon as you start creating a template, you will have to use a dedicated document "Style". Depending on your needs, and most probably on the Microsoft Word text formatting you are targeting, you will either use the default "Basic 2.0" or have to create your own.

See “Basic 2.0” example:

Amend the different elements of your "Document style" from the right-hand side drop-down menu, like headings and paragraphs, and check the preview reflecting the changes you made on the left. Don't forget to save your changes, so that documents using your style are modified accordingly. It is entirely possible to change a document style after you originally chose one during the first template creation step. Moreover, if you create a template with several automated documents, it is also possible to use a specific "Document style" for each of them:

At this stage, we need to take a quick look at the M-Files "dbi – Ment" vault to avoid any confusion about the objects we will interact with.

4. M-Files “dbi – Ment” vault objects

For the exercise, we will use the "Sandra Applicant" person object with the metadata properties below

And dbi services Delémont Office metadata properties

All the objects above have been set up in the M-Files vault by the accountable user as a prerequisite.
Now that everything is in order, let's go back to M-Files Ment and complete our task.

5. "Hiring contract" template creation
5.1 Import document text content source

Set a document name.

Then you can either choose to upload your Word .docx file source.

Or click on “Add document” and start to write down directly your text.

Let’s choose this second option and add all our necessary document text content.
Remember, the goal at the end is to produce either a Swiss French or a Swiss German version of the content.

Hence, the order of appearance is not really important, since this will be managed later through the block automation steps.
Save your changes.
Before we move on, you have certainly observed that our style is not yet applied.
To apply it, we need to select each text element (title, header, table, lists)

and use the proper menu corresponding to our needs (adjusting fonts and margins accordingly).
Example with “Heading 1”:

Once this action is completed, you can preview your document by selecting the "Preview" button to visualize the final version of your document style:

Let’s assume our format is completed successfully for both sections

Now here comes the power of Ment.

5.2 One, two, three – add automation

All automations are done in three steps:

  • Select the text and automation
  • Define the question
  • Combine the text and the answer

These stand in below menu:

"Free text field(s)" gives the users the possibility to add any free text to the document.
"Inline automation" doesn't let the users enter free text, but lets them select inline between the options you have created.
"Block automation" gives the users the possibility to select between whole blocks of text based on the options you have created.

5.3 Automation examples

Let’s define a block automation selecting the whole Swiss German text section first

Click on “Block automation” menu

A new panel opens, requesting the author to define a question which, according to the answer(s) defined, will drive the content filled into the target document.

Select “Add a new question” and the type corresponding to your needs

Finally click “Next” and combine the answer to the text selected

Then “Done”, and here you are

Repeat these steps with the second Swiss French section to combine the second answer of this first question. But this time, we will “choose an existing question”

and select the corresponding answer “FR” to combine text section previously selected

Then "Done". In the Ment author interface, the result is now presented with 2 yellow blocks bound to one question, "Hiring contract language version", used two times, as the text generated in the target document will be either in French or in German.

Click on “Preview” to visualize and test your automated document

From here, I'm pretty sure you are starting to perceive the potential of this tool, and hopefully that's just the beginning.
Let's carry on with a "Free text field(s)" automation. This time, we will rely on the M-Files vault metadata introduced in section 4.

Select the text “Person Full Name” and “Free text field(s)” automation

Select “dbi – Ment” M-Files vault data source where our metadata are located

Add a new question. The tip here is to avoid creating repetitive questions to fill different parts of the document. Instead, anticipate and group answer elements so this information is updated accordingly all in one go. In this example, the job applicant's (Person) "Address Full Name" will be set and provided together with his/her "Person Full Name".

Select “Person” M-Files vault object class

Choose “Person Full Name” property

And additionally “Address Full Name” property

Click “Next”.
Finally, combine the answer to the text field originally selected

click “Done”.
The first half of this "Select Applicant / Person details" question is completed.

Repeat this procedure for “Address Full Name” as mentioned before and choose an existing question (you guessed in advance which one)

And “Done”.
As a result, both fields are now automated as expected. Notice the question / automation usage increment.

Now, if you remember, we would like to have this information in the Swiss French version of the document as well. Then let's go for it. Not a big deal, since a simple copy/paste of our automated fields should accomplish the job.

Do not forget to save your changes. Notice again the question / automation usage increment.

All good. What about the preview of these automations?

  • Swiss German version
  • Swiss French version

At the end, in order to let Ment users generate a document based on a template, an admin or the author has to publish it.

All these steps let you foresee what you can achieve with M-Files Ment.

6. What’s next

This brief introduction to M-Files Ment shows what an author or admin can do to automate document generation based on templates for knowledge workers.
This is not all. Combining these document templates with M-Files workflows can help your business organization achieve and automate complex tasks with the finest control and results.
I encourage you to watch our recordings (FR version or DE version).
Enjoy watching and do not hesitate to contact us for any project integration or support.

L’article M-Files Ment: no code document automation solution in practice est apparu en premier sur dbi Blog.

Documentum – Login through OTDS without oTExternalID3

Wed, 2025-01-01 09:00

As you might know, Documentum “deprecated” (in reality disabled completely) the different Authentication Plugins that were bundled with a Documentum Server. That means that with recent versions of Documentum, you cannot login to your LDAP-managed account anymore without having configured an OTDS and integrated it with your Documentum Server. After you installed the OTDS, and configured it to work with Documentum, you might be faced with an annoying behavior that makes it impossible to login. This is because, by default, it only supports one specific configuration for the user_login_name (i.e. oTExternalID3). There is a workaround, but it’s not documented, as far as I know, so I’m writing this blog to share that information.

When logging in to a Documentum Server, using the "connect" iAPI command, the Repository will verify if the user_login_name exists. If yes, it will send the Authentication request to the JMS, which will contact the OTDS with the details provided. The OTDS will perform the authentication with whatever Identity Provider you configured inside it and return the result to the JMS, which will then confirm the details to the Repository to either allow or deny the login. In this case, it doesn't matter if the user_source of the dm_user is configured with "LDAP" or "OTDS". Both will behave in the same way and the request will be sent to the JMS and then the OTDS. That's the theory, but there are some bugs / caveats that I might cover in another blog.

I. OTDS Synchronization with default configuration

To do some testing, or if you are setting up a freshly installed Documentum Repository (i.e. no previous LDAP integrations), you might want to keep things simple and therefore you would most probably end up using the default configuration.

The default User Mapping configuration for an OTDS Resource, for Documentum, might be something like:

    Resource Attribute            >> OTDS Attribute          >> Format
    __NAME__                      >> cn                      >> %s
    AccountDisabled               >> ds-pwp-account-disabled >> %s
    client_capability             >>                         >> 0
    create_default_cabinet        >>                         >> F
    user_address                  >> mail                    >> %s
    user_global_unique_id         >> oTObjectGUID            >> %s
    user_login_name               >> oTExternalID3           >> %s
    user_name                     >> cn                      >> %s
    user_privileges               >>                         >> 0
    user_rename_enabled           >>                         >> F
    user_rename_unlock_locked_obj >>                         >> T
    user_type                     >>                         >> dm_user
    user_xprivileges              >>                         >> 0

Please note that the default value for “user_login_name” is “oTExternalID3”. In addition to mapped attributes from the AD / LDAP, OTDS defines some internal attributes that you can use, and this one is one of those. For example, if a cn/sAMAccountName has a value of “MYUSERID”, then:

  • oTExternalID1 == MYUSERID
  • oTExternalID2 == MYUSERID@OTDS-PARTITION-NAME
  • oTExternalID3 == MYUSERID@DOMAIN-NAME.COM
  • oTExternalID4 == DOMAIN\MYUSERID

Therefore, in this case, with the default configuration, you would need to use “MYUSERID@DOMAIN-NAME.COM” to be able to login to Documentum. Nothing else would work as your dm_user would be synchronized/created/modified to have a user_login_name value of “MYUSERID@DOMAIN-NAME.COM”. As a sidenote, the “%s” in the Format column means to keep the formatting/case from the source attribute. In most AD / LDAP, the cn/sAMAccountName would be in uppercase, so you would only be able to login with the uppercase details. There is a parameter that you can set in the server.ini to be able to have a case-insensitive Repository and another one in the JMS, so you might want to take a look at that for example.

Here, I’m setting an AD password in an environment variable and then fetching a dm_user details to show you the current content, before triggering a login attempt (using the “connect” iAPI command):

[dmadmin@cs-0 logs]$ read -s -p "  --> Please enter the AD Password: " ad_passwd
  --> Please enter the AD Password:
[dmadmin@cs-0 logs]$
[dmadmin@cs-0 logs]$
[dmadmin@cs-0 logs]$ iapi REPO_NAME -Udmadmin -Pxxx << EOC
> retrieve,c,dm_user where upper(user_login_name) like 'MYUSERID%'
> get,c,l,user_name
> get,c,l,user_login_name
> EOC

        OpenText Documentum iapi - Interactive API interface
        Copyright (c) 2020. OpenText Corporation
        All rights reserved.
        Client Library Release 20.2.0000.0082

Connecting to Server using docbase REPO_NAME
[DM_SESSION_I_SESSION_START]info:  "Session 011234568006fe39 started for user dmadmin."

Connected to OpenText Documentum Server running Release 20.2.00013.0135  Linux64.Oracle
Session id is s0
API> ...
1112345680001d00
API> ...
MYUSERID
API> ...
MYUSERID@DOMAIN-NAME.COM
API> Bye
[dmadmin@cs-0 logs]$
[dmadmin@cs-0 logs]$
[dmadmin@cs-0 logs]$ iapi REPO_NAME -Udmadmin -Pxxx << EOC
> apply,c,NULL,SET_OPTIONS,OPTION,S,trace_authentication,VALUE,B,T
> connect,REPO_NAME,MYUSERID@DOMAIN-NAME.COM,dm_otds_password=${ad_passwd}
> apply,c,NULL,SET_OPTIONS,OPTION,S,trace_authentication,VALUE,B,F
> EOC

        OpenText Documentum iapi - Interactive API interface
        Copyright (c) 2020. OpenText Corporation
        All rights reserved.
        Client Library Release 20.2.0000.0082

Connecting to Server using docbase REPO_NAME
[DM_SESSION_I_SESSION_START]info:  "Session 011234568006fe40 started for user dmadmin."

Connected to OpenText Documentum Server running Release 20.2.00013.0135  Linux64.Oracle
Session id is s0
API> ...
q0
API> ...
s1
API> ...
q0
API> Bye
[dmadmin@cs-0 logs]$

As you can see above, the result of the “connect” command is “s1”, which means the session is opened and Documentum was able to verify through the OTDS that the login is correct. On the JMS, there is an “otdsauth.log” file, that gives you this kind of information (might give a bit more information depending on the Documentum Server version used):

[dmadmin@cs-0 logs]$ cat otdsauth.log
...
2025-01-01 13:37:26,417 UTC DEBUG [root] (default task-6) In com.documentum.cs.otds.OTDSAuthenticationServlet
2025-01-01 13:37:26,780 UTC DEBUG [root] (default task-6) userId: MYUSERID@DOMAIN-NAME.COM
2025-01-01 13:37:26,782 UTC DEBUG [root] (default task-6) Password Auth Success: MYUSERID@DOMAIN-NAME.COM
[dmadmin@cs-0 logs]$

The Repository logs will also show the trace_authentication details and the OTDS will also have a successful authentication attempt in its logs. So, all is well in a perfect world, right?

II. OTDS Synchronization with updated configuration

When working with an existing Repository that was initially set up with LDAP Sync and Auth, you might have a "simple" configuration that defined that the user_login_name would be the cn/sAMAccountName attribute from the Active Directory. In this case, you probably don't want to change anything after the integration of the OTDS… After all, the OTDS is supposed to simplify the configuration and not complexify it. Therefore, you would setup the OTDS to integrate (Synchronized Partition or Non-Synchronized one) with your AD / LDAP and then create a Resource that would replicate and match the exact details of your existing users. Even on a freshly installed Repository without previous LDAP integration, you might choose to login with "MYUSERID" (or "myuserid") instead of "MYUSERID@DOMAIN-NAME.COM". The OTDS will allow you to configure that, so users can be synchronized to Documentum however you want.

To achieve that, you would need to change the User Mapping configuration a bit to keep your previous login information / avoid messing with the existing dm_user details. For example, you might want to change the client_capability, user_login_name, user_name and some other things. Here is an example of a configuration that allows you to synchronize the users with the cn/sAMAccountName from your AD / LDAP, in lowercase; please note the changes marked with a wildcard (*):

    Resource Attribute            >> OTDS Attribute          >> Format
    __NAME__                      >> cn                      >> %l (*)
    AccountDisabled               >> ds-pwp-account-disabled >> %s
    client_capability             >>                         >> 2 (*)
    create_default_cabinet        >>                         >> F
    user_address                  >> mail                    >> %s
    user_global_unique_id         >> oTObjectGUID            >> %s
    user_login_name               >> cn (*)                  >> %l (*)
    user_name                     >> displayName (*)         >> %s
    user_privileges               >>                         >> 0
    user_rename_enabled           >>                         >> T (*)
    user_rename_unlock_locked_obj >>                         >> T
    user_type                     >>                         >> dm_user
    user_xprivileges              >>                         >> 32 (*)

The documentation mentions in some places to have the same value for both __NAME__ and user_name, but I'm not sure if that's really required, as I have some customers with different values, and it works anyway. It's pretty common for customers to have the same value for cn and sAMAccountName and to store the displayName into, well, the displayName attribute… On the Documentum side, some customers will use cn as the user_name, but some others will use displayName instead. The user_name is, after all, a kind of displayName, so I don't really understand why OTDS would require both __NAME__ and user_name to be the same. It should instead rely on the user_login_name, no?

After consolidating the OTDS Resource, you should be able to see the correct user_login_name as it was before (with the LDAP Sync job). What’s the purpose of this blog then? Well, the OTDS allows you to change the mapping as you see fit, so that you can replicate exactly what you used to have with an LDAP Sync. But you cannot login anymore…

After the modification of the OTDS Resource User Mapping and its consolidation, here I’m trying to login again (with “myuserid” instead of “MYUSERID@DOMAIN-NAME.COM”) to show the difference in behavior:

[dmadmin@cs-0 logs]$ iapi REPO_NAME -Udmadmin -Pxxx << EOC
> retrieve,c,dm_user where upper(user_login_name) like 'MYUSERID%'
> get,c,l,user_name
> get,c,l,user_login_name
> EOC

        OpenText Documentum iapi - Interactive API interface
        Copyright (c) 2020. OpenText Corporation
        All rights reserved.
        Client Library Release 20.2.0000.0082

Connecting to Server using docbase REPO_NAME
[DM_SESSION_I_SESSION_START]info:  "Session 011234568006fe48 started for user dmadmin."

Connected to OpenText Documentum Server running Release 20.2.00013.0135  Linux64.Oracle
Session id is s0
API> ...
1112345680001d00
API> ...
LastName (Ext) FirstName
API> ...
myuserid
API> Bye
[dmadmin@cs-0 logs]$
[dmadmin@cs-0 logs]$ iapi REPO_NAME -Udmadmin -Pxxx << EOC
> apply,c,NULL,SET_OPTIONS,OPTION,S,trace_authentication,VALUE,B,T
> connect,REPO_NAME,myuserid,dm_otds_password=${ad_passwd}
> apply,c,NULL,SET_OPTIONS,OPTION,S,trace_authentication,VALUE,B,F
> EOC

        OpenText Documentum iapi - Interactive API interface
        Copyright (c) 2020. OpenText Corporation
        All rights reserved.
        Client Library Release 20.2.0000.0082

Connecting to Server using docbase REPO_NAME
[DM_SESSION_I_SESSION_START]info:  "Session 011234568006fe4f started for user dmadmin."

Connected to OpenText Documentum Server running Release 20.2.00013.0135  Linux64.Oracle
Session id is s0
API> ...
q0
API> ...
[DM_SESSION_E_AUTH_FAIL]error:  "Authentication failed for user myuserid with docbase REPO_NAME."


API> ...
q1
API> Bye
[dmadmin@cs-0 logs]$

This time the authentication fails. If you look at the Repository logs, you can see the user is detected properly, and the Repository starts the authentication with the OTDS (1st line below). But when the result comes back (2nd and 3rd lines below), it says that it failed:

2025-01-01T13:46:16.446426      188808[188808]  011234568006fe50        [AUTH]  Start-AuthenticateUserByOTDSPassword:UserLoginName(myuserid)
2025-01-01T13:46:16.815111      188808[188808]  011234568006fe50        [AUTH]  otds_password_authentication = false:
2025-01-01T13:46:16.815159      188808[188808]  011234568006fe50        [AUTH]  End-AuthenticateUserByOTDSPassword: 0
2025-01-01T13:46:17.174676      188808[188808]  011234568006fe50        [AUTH]  Final Auth Result=F, LOGON_NAME=myuserid, ...

The JMS otdsauth.log file will have similar content: it will start the OTDS communication (1st line below), but the result returned (2nd line below) is not the user_login_name of Documentum. Instead, it's the value of oTExternalID3, and then the JMS says that it failed (3rd line below):

2025-01-01 13:46:16,671 UTC DEBUG [root] (default task-6) In com.documentum.cs.otds.OTDSAuthenticationServlet
2025-01-01 13:46:16,813 UTC DEBUG [root] (default task-6) userId: MYUSERID@DOMAIN-NAME.COM
2025-01-01 13:46:16,814 UTC DEBUG [root] (default task-6) Password Auth Failed: myuserid

On the OTDS side, no problems, the authentication was successful when it was received (in the directory-access.log):

2025-01-01 13:46:16.777|INFO  ||0|0|Authentication Service|Success Access|27,Initial authentication successful|172.0.0.10|""|OTDS-PARTITION-NAME|"MYUSERID@DOMAIN-NAME.COM"|"Authentication success: MYUSERID@DOMAIN-NAME.COM using authentication handler OTDS-PARTITION-NAME for resource __OTDS_AS__"

If you look at the exact timestamp of the messages, you see the exact flow of how things went. In short, the OTDS says that it’s OK and it sends back some information to the JMS. But because the information returned is oTExternalID3, there is a mismatch with the value of the user_login_name and the JMS/Repository then concludes that the authentication failed, which isn’t true…

Therefore, using any user_login_name value other than oTExternalID3 isn’t a problem from a synchronization point of view, but you still cannot login anyway.

III. Workaround

As mentioned in the introduction of this blog, there is a workaround, which is to set the parameter "synced_user_login_name=sAMAccountName" in the otdsauth.properties file that configures how the JMS talks to the OTDS. I looked at all the OTDS and Documentum documentation, for several versions, as well as KBs, but I couldn't find this workaround mentioned anywhere. Maybe I'm the one who doesn't know how to search (don't blame the search on the OT Support website :D). The one and only reference to this parameter is in the Documentum Server Admin & Config documentation, but it tells you that it's optional and only for OTDS token-based authentication. Here, we are doing a password-based auth, we don't have any OTDS oAuth Client ID/Secret, so this section shouldn't be required at all. You don't need the other parameters from this section, but you DO need "synced_user_login_name" if you would like to login with the cn/sAMAccountName/oTExternalID1/oTSAMAccountName value.
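Just as a sketch of what the change looks like on the Documentum Server: the path below is a placeholder (the exact location of otdsauth.properties depends on your JMS deployment), and a Method Server restart is assumed so that the property gets picked up:

# <OTDSAUTH_CONFIG_DIR> is a placeholder for the directory containing otdsauth.properties
echo "synced_user_login_name=sAMAccountName" >> <OTDSAUTH_CONFIG_DIR>/otdsauth.properties

# then restart the JMS (Method Server) so that the new property is taken into account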

However, there is an additional catch… The parameter was apparently only introduced in 20.3. For any older Documentum Server, you will need to check with OT if they have a fix available. I know there is one for 20.2, but it’s only for Windows (c.f. here). Now, you know that you can also use this parameter for that purpose.

L’article Documentum – Login through OTDS without oTExternalID3 est apparu en premier sur dbi Blog.

Documentum – Silent Install – OTDS

Mon, 2024-12-30 03:53

A bit more than 6 years ago (already), I wrote a series of blogs about the silent install of some standard Documentum components. It included things from the binaries, the Docbroker (Connection Broker), the Repositories (both Primary and Remote), D2 and finally xPlore (binaries, Dsearch and IndexAgent). As of today, the principle is still the same for these components. Of course, the exact list of parameters might have changed a bit for some of them, like new parameters being added, but in essence, it's still accurate.

I also worked with the silent install of several other components such as the Documentum BPM / xCP, the Life Sciences Suite (LSTMF / LSQM / LSRD / LSSSV or the complete LSS bundle) or the IDS (Interactive Delivery Services), but I didn't write blogs about them. If you are interested, don't hesitate to ask and I will see if I can write something about it. In this one, I will share my view on the silent install process of the OTDS (OpenText Directory Services), as it's the new standard for user management in OpenText products.

I. Properties file

So, let’s start right away with the preparation of the properties file:

[tomcat@otds_01 ~]$ vi /tmp/otds_silent.properties
[tomcat@otds_01 ~]$ cat /tmp/otds_silent.properties
[Setup]
Id=OTDS
Version=24.4.0.4503
Patch=0
Basedir=/opt/workspace
Configfile=/opt/workspace/setup.xml
Action=Install
Log=/opt/workspace/otds-installer.log
Instance=1
Feature=All

[Property]
INST_GROUP=tomcat
INST_USER=tomcat
INSTALL_DIR=/app/app_data/otds
TOMCAT_DIR=/app/tomcat
PRIMARY_FQDN=otds-0.domain.com
ISREPLICA_TOPOLOGY=0
IMPORT_DATA=0
OTDS_PASS=otdsAdm1nP4ss+w0rd
ENCRYPTION_KEY=
MIGRATION_OPENDJ_URL=
MIGRATION_OPENDJ_PASSWORD=otdsAdm1nP4ss+w0rd
JDBC_CONNECTION_STRING=jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=db1_hostname.domain.com)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=otds_svc.domain.com)))
JDBC_USERNAME=OTDS
JDBC_PASSWORD=jdbcP4ss+w0rd
[tomcat@otds_01 ~]$
II. Setup section

First, there is a “Setup” section in the silent install properties file, with values that aren’t really related to the OTDS installation but more about the current environment / binaries to be used. In any cases, here is a short description of the “Setup” section properties:

  • Id: The component to be installed, should match the value of “Id” from the “setup.xml” file
  • Version: Matches the value of “Version” from the “setup.xml” file
  • Patch: Matches the value of “Patch” from the “setup.xml” file
  • Basedir: The path of the base folder where the OTDS installation package (e.g. OTDS-2440-LNX.tar) has been extracted to. This path needs to contain the folders “lib”, “OTDS” and “tools”
  • Configfile: The path of the “setup.xml” file that contains the details of what needs to be done
  • Action: Either “Install” to install a new OTDS where there is none, “Upgrade” to upgrade an existing version to a newer one or “Modify” to uninstall the local OTDS (needs to be using the same version that was installed, you cannot uninstall using newer binaries)
  • Log: The path of the log file that will be generated by the execution of the installer
  • Instance: Standard OT installer details, not really used by OTDS and mostly for the Windows installer part apparently.
  • Feature: Standard OT installer details, not really used by OTDS and mostly for the Windows installer part apparently. OTDS doesn’t support anything else than “All”

I'm not sure why we would need to specify both the Basedir AND the Configfile, as the "setup.xml" will be present inside the Basedir if you took the OTDS installation package. Maybe it's just a convenience for OpenText, to be able to use the same installer/binaries with multiple "setup.xml" files, but anyway… Same thing for the "Id", "Version" and "Patch" parameters: all these details can be found inside the "setup.xml" file, for which we define the Configfile parameter. Therefore, the OTDS installer could just fetch these details by itself from the setup.xml file. I guess it's not smart enough to do that, so you need to do it yourself, otherwise the installer will complain about it.

III. Property section

Then, there is a “Property” section in the silent install properties file, which really controls the OTDS installation details. Here is a short description of the “Property” section properties:

  • INST_GROUP: Name of the OS Group that will be used for group ownership of OTDS files (e.g.: primary group of the user running the Tomcat process)
  • INST_USER: Name of the OS User that will be used for user ownership of OTDS files (e.g.: user running the Tomcat process)
  • INSTALL_DIR: The path of the base folder where OTDS will put its data. That will include the OTDS config folder (with “otds.properties”) and the different application files (WEB-INF, META-INF and other jsp/js/html files). When installing OTDS like that, there won’t be any WAR files deployed on the webapps folder of Tomcat. The WAR content will be exploded/extracted and available in this INSTALL_DIR location and Tomcat will run these web applications by loading their “Context” inside the $CATALINA_HOME/conf/Catalina/localhost/otds*.xml files
  • TOMCAT_DIR: The path of the Tomcat install base, which is usually $CATALINA_HOME
  • PRIMARY_FQDN: The Fully Qualified Domain Name of the local host. Even if it says Primary, in case of replicated setup, you can set the local host only, as the HA setup is handled by the DB itself in recent versions of OTDS (without OpenDJ)
  • ISREPLICA_TOPOLOGY: A value of “0” indicates the installation to be done is a Primary and a value of “1” is for a replica
  • IMPORT_DATA: A value of “0” indicates to NOT import any data (from a previous OTDS version on OpenDJ (and not a Database)) and a value of “1” will ask you further questions about the OpenDJ URL and password to be used
  • OTDS_PASS: Password to be defined for the OTDS admin account. Only asked if it’s for a Primary initial installation (i.e. no previous local install, not a replica and no previous import)
  • ENCRYPTION_KEY: Specify the Data Encryption Key that was used for this environment. Only asked if there is no local install and if it’s a replica or a primary that requires an import. The value to be used here (e.g. in case of replica install) can be found on the Primary installation as the property “directory.bootstrap.CryptSecret” of the OTDS config file ($INSTALL_DIR/config/otds.properties)
  • MIGRATION_OPENDJ_URL: OpenDJ URL to be used to connect to previous version of OTDS in case import is required on a primary installation
  • MIGRATION_OPENDJ_PASSWORD: OpenDJ password to be used to connect to previous version of OTDS in case import is required on a primary installation
  • JDBC_CONNECTION_STRING: Database JDBC Connection String to be used to connect to the OTDS Database
  • JDBC_USERNAME: Database username to be used to connect to the OTDS Database
  • JDBC_PASSWORD: Database password to be used to connect to the OTDS Database

Most parameters are only needed for a fresh local installation and further upgrades will re-use initial properties.

IV. Execution

Once the properties file is ready, you can install the OTDS in silent using the following command:

[tomcat@otds_01 ~]$ /opt/workspace/setup -qbi -rf /tmp/otds_silent.properties

OpenText Directory Services 24.4.0


------------------------------------------------------------------------------
  OpenText Directory Services
------------------------------------------------------------------------------
Installing OpenText Directory Services Component
Please wait .
Installation of OpenText Directory Services Component OK

Installation completed. Results:

OpenText Directory Services Component OK

Installation finished.

[tomcat@otds_01 ~]$

You now know how to do a silent install of the OTDS, but that's only the tip of the iceberg. The next step is to configure it, as you will need to create the user partitions (synchronized or not), password policies, resources, access roles, etc. If you are already familiar with the OTDS, then you can automate the configuration pieces using the REST-API, but that won't be covered in this blog.
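Just to give an idea of what such automation could look like, here is a rough curl sketch; the endpoint paths, payload fields and header name are assumptions based on the usual OTDS REST layout (/otdsws/rest/...) and should be validated against the API documentation or Swagger UI of your OTDS version:

# request an OTDS ticket for the default admin account (endpoint and payload assumed)
curl -s -X POST "https://otds-0.domain.com/otdsws/rest/authentication/credentials" \
     -H "Content-Type: application/json" \
     -d '{"userName":"otadmin@otds.admin","password":"otdsAdm1nP4ss+w0rd"}'

# the returned ticket is then sent in the OTDSTicket header for the configuration calls,
# for example to list the existing user partitions
curl -s -H "OTDSTicket: <TICKET_FROM_PREVIOUS_CALL>" \
     "https://otds-0.domain.com/otdsws/rest/partitions"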

L’article Documentum – Silent Install – OTDS est apparu en premier sur dbi Blog.

M-Files Ment Integration configuration

Fri, 2024-12-27 03:20

In this blog we will be looking at the configuration of M-Files Ment Integration into an M-Files vault. We will discuss the configuration, the benefits and learn how to work with it.
This is a follow-up to the webinar I hosted on 10 December and is aimed at technical teams responsible for configuration, implementation and end-user training.

Overview

The latest version of the M-Files Ment integration supports three distinct use cases:

  • Start document generation out of the M-Files Desktop (installation of the M-Files Ment vault integration is required)
  • Content of M-Files can be inserted in the document generated by M-Files Ment
  • The generated document in M-Files Ment can be directly stored in the M-Files vault

As previously indicated by M-Files, the Ment ecosystem will transition from AWS to Azure in 2025. Furthermore, additional developments and use cases are anticipated in the future. To remain apprised of forthcoming announcements, we recommend following dbi services and M-Files.

Prerequisites

To enable the integration of M-Files Ment in your M-Files vault, the following prerequisites are required.

  • M-Files Desktop version 22.5 or later
  • M-Files Server version 22.5 or later
  • Microsoft Entra ID with OAuth must be configured in the M-Files vault
  • M-Files Rest API (with the classic M-Files Web) must be configured in the M-Files vault
  • M-Files Rest API must be routable from the Ment IP address for M-Files Ment to retrieve data from the Vault
  • The user accounts must be present in the M-Files vault and the Ment instance with the same email address
  • A unique domain name is required for each integrated vault (default for M-Files Cloud vaults)
  • For M-Files on-premises environments, the URL of the vault must be whitelisted in M-Files Ment
How to configure the Entra ID (on-premises environment)

Prior to commencing the configuration process, it is advisable to create a local M-Files account as a recovery account. In the event that the federated authentication settings are not correctly configured, access to the M-Files vault may be lost.

The first thing you need to do is create and configure an application in the Azure Portal. Go to the section titled ‘Microsoft Entra ID’ and select ‘App registration’. You will then need to register a new application. Select the supported account type from the list provided below.

  • Accounts in this organisational directory only
  • Accounts in any organisational directory
  • Accounts in any organisational directory and personal Microsoft accounts
  • Personal Microsoft accounts only

Once you have clicked register, the overview page of the application will appear. It is important to note down both the application (client) ID and the directory (tenant) ID.
Go to the authentication section and add a new web platform. Enter the address in the following format for the redirect URI.

https://<VAULT_ADDRESS>/authentication/<CONFIGURATION_NAME>/read
<VAULT_ADDRESS> = address of your vault
<CONFIGURATION_NAME> = this is the name you use for your M-Files configuration
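For example, with a hypothetical vault address and configuration name, the redirect URI would look like this:

https://vault.example.com/authentication/MentSSO/read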

Please be advised that the following URIs must now be added to your web application.

  • https://<VAULT_ADDRESS>/authentication/<CONFIGURATION_NAME>/mobile
  • https://<VAULT_ADDRESS>/vnext
  • https://<VAULT_ADDRESS>/vnext/webuiintegrator.html
  • https://<VAULT_ADDRESS>/vnext/msteamsoauth.html
  • https://<VAULT_ADDRESS>
  • https://<VAULT_ADDRESS>/webuiintegrator.html
  • https://<VAULT_ADDRESS>/msteamsoauth.html
  • http://localhost
Example of Entra ID web application configuration
How to configure the Entra ID application in M-Files Ment

Please configure your Entra ID application to enable authentication on M-Files Ment. You will need to add the redirect URL, which is https://ment.m-files.com/auth/login?sso=mfilesAzure. Alternatively, you can replace the ‘ment.m-files.com’ domain name with the domain you use in your environment.

You must configure the primary vault connection in M-Files Ment. Log in as a Ment administrator and enter the M-Files vault URL in the configuration tab to issue a token for access. Next, confirm the authentication to the vault. The email address of the M-Files Ment login must match the one used to issue the new token in Entra ID. Details are shown in the picture below.

Example of the SSO configuration in M-Files Ment
How to benefit from the M-Files integration

In this chapter I will explain the different benefits of an M-Files Ment integration in an M-Files vault.

When you create a new template in M-Files Ment, you can define an M-Files vault as the data source and use the data you already have in your M-Files vault. Simply choose the vault for your template from the drop-down menu, provided that the vault is configured in Ment to be used.

Let’s get started with template generation. Add your text blocks in the editor section, then select the text you want to replace with information from M-Files. In our example, we’re using the M-Files vault dbi – Ment as our data source to choose a name from the vault.

Configure a test in M-Files Ment to use a M-Files vault

The next step is to define the question and the M-Files vault metadata that you wish to utilise. In this instance, we are using the employee field to select the name of the employee that we wish to add to the document.

Additionally, the metadata for a template can be configured to store it in the M-Files vault. This predefined metadata will then be used for the documents stored in the vault.
The configuration of this predefined metadata is straightforward. To begin, open the template in M-Files Ment and select the metadata configuration icon (see the illustration below).

Instructions on how to find the metadata configuration button for a template

The example below illustrates the available configuration options based on the M-Files vault configuration. You can define the class, including the class properties and related questions, as well as the workflow to be used, including the workflow state.

Example of the metadata configuration for a template in M-Files Ment
Concluding words

M-Files Ment is the ideal solution for generating documents based on predefined phrases, no coding required! The user can simply go through the questionnaire and, hey presto, the documents are ready to go! What’s more, they can be stored in M-Files, where they can enjoy all the benefits of our fantastic feature set, including workflows, version control, automatic permissions, sharing with colleagues or clients, and much more. Get in touch with us today and we’ll discuss your specific use case and prepare a demo.

If you are currently utilising M-Files Ment and require assistance with configuration, please do not hesitate to contact us. We are pleased to offer our support and assistance.

L’article M-Files Ment Integration configuration est apparu en premier sur dbi Blog.

FreeBSD basics – 8 – Running Linux Containers

Fri, 2024-12-20 00:22

When people talk about containers in FreeBSD, they usually talk about Jails. We've already seen how this works in general here and here. An additional option you have is to install Podman on FreeBSD. Podman is well known in the Linux world and can be used to build and run containers. With the release of FreeBSD 14.2, the FreeBSD project is now also publishing OCI-compatible container images of FreeBSD.

We again start with a fresh installation of FreeBSD 14.2; nothing was added or modified. Podman, as most of the additional software, can either be installed as a binary package or from the ports collection. If you want to install from the ports collection, you can find Podman here:

root@freebsd14-latest:~ $ find /usr/ports/ -name "*podman*" 
/usr/ports/sysutils/podman
/usr/ports/sysutils/podman/files/podman_service.in
/usr/ports/sysutils/podman/files/podman.in
/usr/ports/sysutils/podman-suite

As it is faster to install from binary packages, we’ll do it like this for the scope of this post:

root@freebsd14-latest:~ $ pkg update
The package management tool is not yet installed on your system.
Do you want to fetch and install it now? [y/N]: y
Bootstrapping pkg from pkg+https://pkg.FreeBSD.org/FreeBSD:14:amd64/quarterly, please wait...
Verifying signature with trusted certificate pkg.freebsd.org.2013102301... done
Installing pkg-1.21.3...
Extracting pkg-1.21.3: 100%
Updating FreeBSD repository catalogue...
Fetching meta.conf: 100%    178 B   0.2kB/s    00:01    
Fetching data.pkg: 100%    7 MiB   7.5MB/s    00:01    
Processing entries: 100%
FreeBSD repository update completed. 35568 packages processed.
All repositories are up to date.
root@freebsd14-latest:~ $ pkg upgrade
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
Updating database digests format: 100%
Checking for upgrades (1 candidates): 100%
Processing candidates (1 candidates): 100%
Checking integrity... done (0 conflicting)
Your packages are up to date.
root@freebsd14-latest:~ $ pkg install podman
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
The following 33 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
        conmon: 2.1.12_3
        containernetworking-plugins: 0.5_9
        containers-common: 0.59.1
        gettext-runtime: 0.22.5
...
--
The FreeBSD port of the Podman container engine is experimental and should be
used for evaluation and testing purposes only.

$ sudo podman run --rm docker.io/dougrabson/hello

Podman can restart containers after a host is rebooted. To enable this, use:

$ sudo sysrc podman_enable=YES

and start the container with a restart policy:

$ sudo podman run -d --restart=always myimage

It is possible to run many Linux container images using FreeBSD's Linux emulation:

$ sudo sysrc linux_enable=YES
$ sudo service linux start
$ sudo podman run --rm --os=linux alpine cat /etc/os-release | head -1
NAME="Alpine Linux"

The package gives you quite some information at the end of the installation process. First of all, this is not yet considered production ready and should be used for testing only. Second, you will need to enable Podman and the Linux compatibility layer:

root@freebsd14-latest:~ $ sysrc podman_enable=YES
podman_enable:  -> YES
root@freebsd14-latest:~ $ sysrc linux_enable=YES
linux_enable: NO -> YES
root@freebsd14-latest:~ $ service linux start

If you want the containers to restart automatically, you should also mount the fdescfs file system:

root@freebsd14-latest:~ $ mount -t fdescfs fdesc /dev/fd
root@freebsd14-latest:~ $ df -h | grep fd
fdescfs               1.0K      0B    1.0K     0%    /compat/linux/dev/fd
fdescfs               1.0K      0B    1.0K     0%    /dev/fd
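To make this mount persistent across reboots, you can add an entry to /etc/fstab; the line below is a sketch following the example from the fdescfs(4) man page:

# /etc/fstab entry to mount fdescfs on /dev/fd at boot
fdesc   /dev/fd         fdescfs         rw      0       0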

If you try to search for container images using Podman, you’ll notice that there is no result:

root@freebsd14-latest:~ $ podman search rockylinux

The reason is that no container registries are defined, but this can easily be fixed by adding a registry to "/usr/local/etc/containers/registries.conf":

root@freebsd14-latest:~ $ tail -1 /usr/local/etc/containers/registries.conf
unqualified-search-registries = ['docker.io']

Once this is configured, you can search the registry for container images:

root@freebsd14-latest:~ $ podman search rockylinux
NAME                                   DESCRIPTION
docker.io/rockylinux/rockylinux        
docker.io/library/rockylinux           The official build of Rocky Linux.
docker.io/rockylinux/rocky-toolbox     Toolbox image for Rocky Linux - https://gith...
docker.io/rockylinux/rockylinux-shim   RockyLinux shim-review images
docker.io/amd64/rockylinux             The official build of Rocky Linux.
docker.io/litmusimage/rockylinux       
docker.io/arm64v8/rockylinux           The official build of Rocky Linux.
docker.io/ericpaulsen/rockylinux       
docker.io/romcheck/rockylinux          
docker.io/robertdebock/rockylinux      Container to test Ansible roles in, includin...
docker.io/lightnear/rockylinux         
docker.io/accupara/rockylinux          
docker.io/detravi/rockylinux           
docker.io/ppc64le/rockylinux           The official build of Rocky Linux.
docker.io/aursu/rockylinux             
docker.io/s390x/rockylinux             The official build of Rocky Linux.
docker.io/uacontainers/rockylinux      Up-to-date Rocky Linux Docker images with th...
docker.io/buluma/rockylinux            
docker.io/gammabytehosting/rockylinux  Rocky Linux minimal image.
docker.io/whatwewant/rockylinux        
docker.io/kubestation/rockylinux       
docker.io/4geniac/rockylinux           
docker.io/jeksterslab/rockylinux       
docker.io/ihanick/rockylinux           
docker.io/douglarek/rockylinux         

Running a container is exactly the same as you know it from Linux:

root@freebsd14-latest:~ $ podman run -it --rm --platform linux/amd64 rockylinux/rockylinux:9 /bin/bash
Resolving "rockylinux/rockylinux" using unqualified-search registries (/usr/local/etc/containers/registries.conf)
Trying to pull docker.io/rockylinux/rockylinux:9...
Getting image source signatures
Copying blob 3442e16c7069 done   | 
Copying config bb8a97547d done   | 
Writing manifest to image destination
WARN[0009] Failed to load cached network config: network podman not found in CNI cache, falling back to loading network podman from disk 
WARN[0009] Failed to load cached network config: network podman not found in CNI cache, falling back to loading network podman from disk 
Error: plugin type="bridge" failed (add): cni plugin bridge failed: The pf kernel module must be loaded to support ipMasq networks

… and this fails because we forgot to configure the container networking. There actually are instructions on how to do this in the output of the Podman package installation:

Message from containernetworking-plugins-0.5_9:

--
Container networking relies on NAT to allow container network packets
out to the host's network. This requires a PF firewall to perform the
translation. A simple example is included - to use it:

# cp /usr/local/etc/containers/pf.conf.sample /etc/pf.conf
...
Edit /etc/pf.conf and set v4egress_if, v6egress_if to your network interface(s)
...
# sysrc pf_enable=YES
# service pf start

The sample PF configuration includes support for port redirections. These are
implemented as redirect rules in anchors nested under cni-rdr.

Support for redirecting connections from the container host to services running
inside a container is included for FreeBSD 13.3 and later. To enable this, first
load the pf kernel module, by adding pf_load="YES" to /boot/loader.conf and
enable PF support for these redirections using sysctl:

# kldload pf
# sysctl net.pf.filter_local=1
# service pf restart
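Applied to this system, those steps could look like the following sketch; vtnet0 is an assumption for the egress interface, use the name of your actual NIC:

# copy the sample PF configuration shipped with the plugins package
cp /usr/local/etc/containers/pf.conf.sample /etc/pf.conf
# edit /etc/pf.conf and set v4egress_if (and v6egress_if) to your interface, e.g. vtnet0

# enable and start the PF firewall
sysrc pf_enable=YES
service pf start

# optional: allow redirections from the host to containers (FreeBSD 13.3 and later)
sysrc -f /boot/loader.conf pf_load="YES"
sysctl net.pf.filter_local=1
service pf restart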

Once this is done, all is fine:

root@freebsd14-latest:~ $ podman run -it --rm --platform linux/amd64 rockylinux/rockylinux:9 /bin/bash
[root@71ef409c1d40 /]$ cat /etc/os-release 
NAME="Rocky Linux"
VERSION="9.5 (Blue Onyx)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="9.5"
PLATFORM_ID="platform:el9"
PRETTY_NAME="Rocky Linux 9.5 (Blue Onyx)"
ANSI_COLOR="0;32"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:rocky:rocky:9::baseos"
HOME_URL="https://rockylinux.org/"
VENDOR_NAME="RESF"
VENDOR_URL="https://resf.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
SUPPORT_END="2032-05-31"
ROCKY_SUPPORT_PRODUCT="Rocky-Linux-9"
ROCKY_SUPPORT_PRODUCT_VERSION="9.5"
REDHAT_SUPPORT_PRODUCT="Rocky Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.5"

Quite impressive and easy to set up. You might wonder if running Linux containers on FreeBSD actually is a thing. Personally, I think that this is really great, because it gives you another option in the FreeBSD world. Containers are everywhere today, and having more choice of where to run them is great.

L’article FreeBSD basics – 8 – Running Linux Containers est apparu en premier sur dbi Blog.

Oracle Database Replay for a RAC environment

Thu, 2024-12-19 04:18

The database Capture/Replay feature of the Real Application Testing (RAT) suite enables you to capture the workload from a production database and replay it on a test or development environment. This allows for realistic testing of database changes, such as upgrades, patches, or hardware migrations.

In this blog, we’ll explore how to accomplish this when both your production and test databases are configured in a two-node RAC environment.

Restrictions:

You should be aware that some workload operations are not supported, such as flashback queries and distributed transactions.

In most cases, Database Replay refrains from capturing these operations, or does not replay them. However, you should refer to the Oracle documentation below for the full list of unsupported operations and related considerations:

Oracle Database Documentation

The Workload Capture

The capture procedure on a RAC database is the same as for a single-instance database: you just need to run it on one of the instances, without adding an instance filter.

The path you use to store the raw workload files must be defined as a database directory, and must be empty.

SQL> CREATE DIRECTORY capture_dir AS '/u90/db_capture/';

It is recommended, but NOT mandatory, to restart your database before the capture, so that all current processes are finished and the capture catches absolutely all changes from the timestamp of its start.

Let’s take, for instance, a 4-hour workload capture (14,400 seconds):

BEGIN
  DBMS_WORKLOAD_CAPTURE.start_capture (name     => 'capture_4h',
                                       dir      => 'CAPTURE_DIR',
                                       duration => 14400);
END;
/

This call will capture the workload on both RAC instances for 4 hours and then stop automatically. If you need to stop it earlier, you can do so manually using:

SQL>  exec DBMS_WORKLOAD_CAPTURE.finish_capture;

The capture process does not significantly impact performance (it adds about 3-5% CPU overhead). However, it may require substantial storage space. A common reason to stop the capture prematurely is running out of space in the designated directory.

Oracle Support Note Doc ID 1920275.1 provides a method to estimate the space required. For a detailed explanation, please refer to the note:

Doc ID 1920275.1

The general formula for the space needed, as provided by this note, is:

 
2 * Bytes received via SQL*Net from client (as seen in AWR or StatsPack).

Alternatively, you can perform a short capture, such as for 10 minutes, and extrapolate the estimate based on the size captured during that period.
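
If you want a rough figure for that formula directly out of AWR, the following sketch sums the per-snapshot deltas of the statistic over a window matching the planned capture duration. This is only an illustration: it assumes the standard DBA_HIST_SYSSTAT and DBA_HIST_SNAPSHOT views and ignores instance restarts within the window:

SQL> -- rough estimate: 2 * bytes received via SQL*Net from client over the last 4 hours
SELECT ROUND(2 * SUM(delta_bytes) / 1024 / 1024 / 1024, 2) AS estimated_capture_gb
FROM (
  SELECT s.value
         - LAG(s.value) OVER (PARTITION BY s.instance_number ORDER BY s.snap_id) AS delta_bytes
  FROM   dba_hist_sysstat s
  JOIN   dba_hist_snapshot sn
         ON  sn.snap_id = s.snap_id
         AND sn.instance_number = s.instance_number
  WHERE  s.stat_name = 'bytes received via SQL*Net from client'
  AND    sn.begin_interval_time > SYSDATE - 4/24   -- same length as the planned 4h capture
);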

At the end of the capture, you should generate an HTML summary and export the related AWR data. For that, you’ll need the ID of your capture:

SQL> SELECT id, name FROM dba_workload_captures;

        ID NAME
---------- ------------------------------
        11 capture_4h
         2 test_capture_1

SQL>

So the capture ID in my case is 11.

– Generating the HTML report (it is generated in the directory of the capture):

SQL> DECLARE
  l_report  CLOB;
BEGIN
  l_report := DBMS_WORKLOAD_CAPTURE.report(capture_id => 11,
                                           format     => DBMS_WORKLOAD_CAPTURE.TYPE_HTML);
END;
/

– Exporting the AWR data (this generates a Data Pump dump file containing all AWR data for the period of the capture):

SQL> exec DBMS_WORKLOAD_CAPTURE.export_awr (capture_id => 11);
The Replay

To prepare the replay, you need two things:

– Copy all the files from the capture directory to the test system (as it is a two-node RAC, they should be on a shared location).

– Prepare a test database: the HTML report of the capture contains the start SCN, so it is best practice to duplicate your database up to that specific SCN in the test environment before applying the changes you want to evaluate (for instance an upgrade, or parameter changes).

Once your test database is ready and the captured files are available on a shared location:

– We create the replay directory:

SQL> CREATE OR REPLACE DIRECTORY replay_dir AS '/u90/replay/';

The captured raw files must be processed by the database before the capture can be replayed. This step can take some time, depending on the size of the workload raw files:

SQL> exec DBMS_WORKLOAD_REPLAY.process_capture('REPLAY_DIR');

Then we initiate the replay (just an initialization, the replay is not yet started):

SQL> exec DBMS_WORKLOAD_REPLAY.initialize_replay(replay_name => 'replay_capture_4h', replay_dir  => 'REPLAY_DIR');

This step generates an entry in the DBA_WORKLOAD_REPLAYS view; we must retrieve the associated ID:

SQL> select id,status from dba_workload_replays;
 
        ID STATUS
---------- ----------------------------------------
         1 COMPLETED
        11 COMPLETED
        21 INITIALIZED

So, here, the ID of the replay is 21.

The next step is to remap the workload connections. To be able to apply the replay on both instances, we must use a TNS entry pointing to a SCAN host and SCAN service:

SQL>
set linesize 250
set pages 9999
spool remap_connections.sql
select 'exec dbms_workload_replay.remap_connection ( connection_id => '||conn_id||', replay_connection => ''(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=a_scan_host)(PORT=your_port))(CONNECT_DATA=(SERVICE_NAME=a_scan_service)))'');' from dba_workload_connection_map where replay_id=21;
spool off

Then we clean up and execute the script:

sqlplus / as sysdba
SQL>
@remap_connections.sql
…………………
PL/SQL procedure successfully completed.
 
 
PL/SQL procedure successfully completed.
 
SQL>

Next, we prepare the replay:

SQL> exec DBMS_WORKLOAD_REPLAY.prepare_replay (synchronization => TRUE);
 
PL/SQL procedure successfully completed.
 
SQL>

synchronization => TRUE ensures that the replay sessions are synchronized, maintaining the order of and dependencies between transactions as they occurred during the workload capture.

Checking the status of the replay:

SQL> select id,status from dba_workload_replays;
 
        ID STATUS
---------- ----------------------------------------
         1 COMPLETED
        11 COMPLETED
        21 PREPARE

Next, we need to calibrate the replay clients using the wrc program in calibrate mode. wrc is the Workload Replay Client tool, used to replay database workload captured by Oracle Database.

The purpose of calibration mode is to ensure that the workload replay environment has enough replay clients to handle the workload in the capture. After calibration, the wrc tool outputs the minimum number of replay clients required.

[oracle@LinuxHost~]$ wrc mode=calibrate replaydir=/u90/replay
 
Workload Replay Client: Release 19.23.0.0.0 - Production on Wed Dec 11 10:58:06 2024
 
Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.
 
 
 
Workload Report
-----------------------
Recommendation:
Consider using at least 52 clients divided among 14 CPU(s)
You will need at least 152 MB of memory per client process.
If your machine(s) cannot match that number, consider using more clients.
 
Workload Characteristics:
- max concurrency: 3984 sessions
- total number of sessions: 56136
 
Assumptions:
- 100 concurrent sessions per client process
- 4 client processes per CPU
- 256 KB of memory cache per concurrent session
- think time scale = 100
- connect time scale = 100
- synchronization = TRUE
 
[oracle@LinuxHost ~]$

Based on this calibration, and after checking the CPU and memory available, I will use 56 clients (56 x 152 MB = 8.5 GB of minimal memory needed for the clients: OK for me).

This means using 28 clients per node, as I want to balance the overhead across the two nodes.

The command to start a single client is:

[oracle@LinuxHost]$ wrc system/password@test mode=replay replaydir=/u90/replay

But I prepared a script to start 28 clients at once in background mode, for instance on the first node:

 
[oracle@LinuxHost]$ cat replay_node1.sh
#!/bin/bash
 
export ORACLE_HOME=/u01/app/oracle/product/19.0.0.0/dbhome_1
export PATH=/home/oracle/.local/bin:/home/oracle/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/u01/app/oracle/product/19.0.0.0/dbhome_1/bin
 
# Configuration
USER="system"
PASSWORD="Oracle$"
DB_CONNECT="Instance1"
MODE="replay"
REPLAY_DIR="/u90/replay"
LOG_DIR="./wrc_logs"  # Directory to store log files
NUM_CLIENTS=28        # Number of clients to start
 
# Ensure the log directory exists
mkdir -p $LOG_DIR

# Start WRC clients in a loop
for ((i=1; i<=NUM_CLIENTS; i++))
do
nohup wrc ${USER}/${PASSWORD}@${DB_CONNECT} mode=${MODE} replaydir=${REPLAY_DIR} > ${LOG_DIR}/wrc_client_$i.log 2>&1 &
echo "Started WRC client $i, log: ${LOG_DIR}/wrc_client_$i.log"
done
 
echo "All $NUM_CLIENTS clients started. Logs are in $LOG_DIR."
 
[oracle@LinuxHost]
 

Starting the clients on the first node:

[oracle@LinuxHost]$ ./replay_node1.sh
Started WRC client 1, log: ./wrc_logs/wrc_client_1.log
Started WRC client 2, log: ./wrc_logs/wrc_client_2.log
Started WRC client 3, log: ./wrc_logs/wrc_client_3.log
Started WRC client 4, log: ./wrc_logs/wrc_client_4.log
Started WRC client 5, log: ./wrc_logs/wrc_client_5.log
Started WRC client 6, log: ./wrc_logs/wrc_client_6.log
Started WRC client 7, log: ./wrc_logs/wrc_client_7.log
Started WRC client 8, log: ./wrc_logs/wrc_client_8.log
Started WRC client 9, log: ./wrc_logs/wrc_client_9.log
Started WRC client 10, log: ./wrc_logs/wrc_client_10.log
Started WRC client 11, log: ./wrc_logs/wrc_client_11.log
Started WRC client 12, log: ./wrc_logs/wrc_client_12.log
Started WRC client 13, log: ./wrc_logs/wrc_client_13.log
Started WRC client 14, log: ./wrc_logs/wrc_client_14.log
Started WRC client 15, log: ./wrc_logs/wrc_client_15.log
Started WRC client 16, log: ./wrc_logs/wrc_client_16.log
Started WRC client 17, log: ./wrc_logs/wrc_client_17.log
Started WRC client 18, log: ./wrc_logs/wrc_client_18.log
Started WRC client 19, log: ./wrc_logs/wrc_client_19.log
Started WRC client 20, log: ./wrc_logs/wrc_client_20.log
Started WRC client 21, log: ./wrc_logs/wrc_client_21.log
Started WRC client 22, log: ./wrc_logs/wrc_client_22.log
Started WRC client 23, log: ./wrc_logs/wrc_client_23.log
Started WRC client 24, log: ./wrc_logs/wrc_client_24.log
Started WRC client 25, log: ./wrc_logs/wrc_client_25.log
Started WRC client 26, log: ./wrc_logs/wrc_client_26.log
Started WRC client 27, log: ./wrc_logs/wrc_client_27.log
Started WRC client 28, log: ./wrc_logs/wrc_client_28.log
All 28 clients started. Logs are in ./wrc_logs.
[oracle@LinuxHost]$

We do the same thing on the second node; we just need to change the DB_CONNECT variable in the script.

Finally, we can start the replay:

BEGIN
  DBMS_WORKLOAD_REPLAY.start_replay;
END;
/

The replay is now started. It can be monitored by:

– Querying the views dba_workload_replays and v$workload_replay_thread (see the example below).
– Checking the client log files.
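
For the first option, a quick check from SQL*Plus could look like this (just a sketch; I am assuming here that your version exposes a STATUS column in v$workload_replay_thread, so adjust the columns to what is available):

SQL> -- overall status of the replay we initialized (ID 21)
SELECT id, name, status FROM dba_workload_replays ORDER BY id;

SQL> -- distribution of the replay client threads per state
SELECT status, COUNT(*) FROM v$workload_replay_thread GROUP BY status;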

Hope it helps!

L’article Oracle Database Replay for a RAC environment est apparu en premier sur dbi Blog.

Microsoft Foundry in Action: Building Smarter AI Solutions with Your Data

Mon, 2024-12-16 03:49
1        Introduction

At the 2024 edition of Microsoft Ignite, two key themes dominated the stage: Artificial Intelligence (AI) and Security. These pillars reflect the current and future technological landscape. Staying true to its innovative legacy, Microsoft unveiled Microsoft Foundry, a unified platform designed to simplify the design, customization, and management of AI solutions.

In this article, I’ll share my hands-on experience testing this innovative platform through an intelligent application built with my own data.

2        Context

The integration of AI into our daily operations is no longer a possibility; it is a present reality. However, many misconceptions still surround AI, leading to hesitancy or misuse. Here’s a reality check:

Misconceptions about AI:

  • AI will replace human jobs.
  • AI freely uses your data to train models.
  • AI is only for large corporations.
  • AI is too complex for average users.
  • AI is only useful in tech industries.

The Reality:

  • AI handles repetitive tasks, freeing up humans for creative work.
  • Provides a creative starting point (overcoming “blank page” anxiety).
  • Reduces human error in routine tasks.
  • Enables 24/7 operational capabilities.
  • Facilitates predictive maintenance.
  • Your data remains secure and isn’t used to train underlying foundation models without consent.

Recognizing the importance of ethics and safety in AI, Microsoft integrates Responsible AI, Microsoft Purview for data governance, and content safety protocols into its solutions:

  • Responsible AI: Microsoft emphasizes fairness, reliability and safety, privacy and security, inclusiveness, transparency, and accountability.
  • Microsoft Purview: A comprehensive tool to enforce governance policies that dictate what AI can return to end users.
  • Content Safety: Prevents harmful outputs by monitoring and filtering responses.
3        Intelligent Application Development

3.1      Use Case

For this demonstration, I built an interactive AI solution based on my own data, showcasing Microsoft Foundry’s capabilities.

3.2      Overview of Microsoft Foundry

Microsoft Foundry is a comprehensive platform for developing and deploying generative AI apps responsibly. As stated by Microsoft here, Microsoft Foundry is designed for developers to:

  • Build generative AI applications on an enterprise-grade platform.
  • Explore, build, test, and deploy using cutting-edge AI tools and ML models, grounded in responsible AI practices.
  • Collaborate with a team for the full life-cycle of application development.

Microsoft Foundry is incredibly accessible:

  • Free to explore
  • No sign-in required for basic features
  • Full functionality with an Azure account
  • Access via https://int.ai.azure.com/

Its main entry point is shown in the picture below:

Clicking on Explore models opens the model catalog. At the time of writing, the platform boasted an impressive 1,812 models – a number constantly expanding. This vast library means there’s likely a model for almost any use case.

3.3      Prerequisites

Before diving into application development, the following resources were set up:

  • Resource Group (rg-foundrytest): Contains all resources for efficient management and deletion to avoid costs.
  • AI Search (aisearch-foundry): Enables intelligent search capabilities.
  • Azure OpenAI (openai-foundry): Provides foundational AI services.
  • Custom Data: For this demo, I uploaded the file Multiturn-ContosoBenefits.pdf.
3.4      Step-by-Step Implementation

3.4.1    Project Creation

A project is where you do most of your development work. Projects are easy-to-manage containers for your work and the key to collaboration, organization, and connecting data and other services.

From the Foundry portal, click Create Project and proceed with default settings.

A hub is a collaborative environment where team members can share projects.

The project is now created.

After setting up the project, necessary resources were verified in the Azure portal.

3.4.2    Adding Your Own Data to the Project

The bring-your-own-data feature supports diverse sources, including on-premises and cloud data, as depicted in the following picture:

But I simplified the process by uploading the Multiturn-ContosoBenefits.pdf file as follows:

From the newly created project’s homepage, click on New Data.

Then, upload the aforementioned Multiturn-ContosoBenefits.pdf.

The following screen confirms that the custom data has been added to the project:

3.4.3    Model Selection and Integration

I selected the model GPT-35-turbo-16k and integrated it into the project through the following steps:

As already mentioned, more than 1,800 models are available.

At this stage, the integrated model gpt-35-turbo-16k (version 0613) relies only on its pre-trained data. Thanks to that, the following query returns a valid result:

3.4.4    Connect the Model to the custom data

Now, we are instructing the model to use our custom data.

The following picture confirms that our model is now linked to our custom data:

As of now, the model is restricted to our custom data: it is configured to exclusively use that dataset.

Because of that, when querying content outside the data source’s scope, the system returned an error as shown below:

However, queries related to the data yield precise results. In this case, the system provides accurate and context-specific responses, demonstrating strict adherence to the defined dataset.

It’s remarkable how seamlessly it provides relevant information!

3.4.5    Integrate into Client Applications

Once the model is fine-tuned, its capabilities are easily integrated into client applications following the architecture below:

The model’s capabilities can be embedded into a client application using pre-generated code snippets available in languages like C#, cURL, Go, Java, JavaScript, JSON and Python.

The following picture shows how to get the generated code:

The integration can also be easily done by following the wizard to deploy an App Service as shown below:

4        In Summary

Microsoft Foundry is a game-changer in AI application development. It democratizes AI by enabling users—regardless of technical expertise—to:

  • Leverage an extensive catalog of pre-trained models.
  • Tailor applications using custom datasets.
  • Seamlessly integrate AI capabilities into enterprise solutions.

Stay tuned for more hands-on insights in upcoming posts!

L’article Microsoft Foundry in Action: Building Smarter AI Solutions with Your Data est apparu en premier sur dbi Blog.

FreeBSD basics – 7 – The Linux compatibility layer

Thu, 2024-12-12 06:36

Now that we know how to work with the ports collection, let’s have a look at another option you have in FreeBSD: sometimes it might be required to run unmodified Linux binaries on FreeBSD, and to achieve this, FreeBSD comes with a Linux compatibility layer. This is optional and not enabled by default, but pretty easy to set up and use. Before you do that, please check if the functionality you’re looking for is not already there, either as a package or in the ports collection. If it is there, use it from there.

To enable the layer at boot time, use “sysrc” as we’ve done in the previous post and start the corresponding service:

root@freebsd14:~ $ sysrc linux_enable="YES"
linux_enable: NO -> YES
root@freebsd14:~ $ service linux start
root@freebsd14:~ $ service linux describe
Enable Linux ABI

Once the service is running, you’ll see additional mount points which Linux applications usually expect under “/compat/linux”:

root@freebsd14:~ $ df -h | grep compat
linprocfs                             8.0K      0B    8.0K     0%    /compat/linux/proc
linsysfs                              8.0K      0B    8.0K     0%    /compat/linux/sys
devfs                                 1.0K      0B    1.0K     0%    /compat/linux/dev
fdescfs                               1.0K      0B    1.0K     0%    /compat/linux/dev/fd
tmpfs                                 2.2G    4.0K    2.2G     0%    /compat/linux/dev/shm

This also loaded the required kernel modules:

root@freebsd14:~ $ kldstat | grep -i linux
10    1 0xffffffff82e2f000    30a80 linux.ko
11    4 0xffffffff82e60000     c2a8 linux_common.ko
12    1 0xffffffff82e6d000    2de10 linux64.ko

The next step is to install the Linux userland. Currently, two of those are available (Debian is available with the debootstrap package):

root@freebsd14:~ $ pkg search linux_base
linux_base-c7-7.9.2009_3       Base set of packages needed in Linux mode (Linux CentOS 7.9.2009)
linux_base-rl9-9.4             Base set of packages needed in Linux mode (Rocky Linux 9.4)

As CentOS 7 is end of life, we’ll obviously go for the Rocky Linux 9.4 userland (this is not the most recent minor release as of today, but it is close). You’ll also see those userlands in the ports tree in the “emulators” category:

root@freebsd14:~ $ ls -la /usr/ports/emulators/linux_base-*
/usr/ports/emulators/linux_base-c7:
total 124
drwxr-xr-x    2 root wheel      8 Dec 11 09:49 .
drwxr-xr-x  193 root wheel    194 Dec 11 09:49 ..
-rw-r--r--    1 root wheel   6478 Dec 11 09:49 Makefile
-rw-r--r--    1 root wheel  36074 Dec 11 09:49 distinfo
-rw-r--r--    1 root wheel    468 Dec 11 09:49 pkg-descr
-rw-r--r--    1 root wheel  95620 Dec 11 09:49 pkg-plist.aarch64
-rw-r--r--    1 root wheel 108836 Dec 11 09:49 pkg-plist.amd64
-rw-r--r--    1 root wheel  95512 Dec 11 09:49 pkg-plist.i386

/usr/ports/emulators/linux_base-rl9:
total 132
drwxr-xr-x    2 root wheel      8 Dec 11 09:49 .
drwxr-xr-x  193 root wheel    194 Dec 11 09:49 ..
-rw-r--r--    1 root wheel   6943 Dec 11 09:49 Makefile
-rw-r--r--    1 root wheel   1380 Dec 11 09:49 Makefile.version
-rw-r--r--    1 root wheel  40711 Dec 11 09:49 distinfo
-rw-r--r--    1 root wheel    466 Dec 11 09:49 pkg-descr
-rw-r--r--    1 root wheel  91404 Dec 11 09:49 pkg-plist.aarch64
-rw-r--r--    1 root wheel 104964 Dec 11 09:49 pkg-plist.amd64

As usual, when it comes to binary packages, pkg is used to bring this onto the system:

root@freebsd14:~ $ pkg install linux_base-rl9
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
Updating database digests format: 100%
The following 1 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
        linux_base-rl9: 9.4

Number of packages to be installed: 1

The process will require 297 MiB more space.
39 MiB to be downloaded.

Proceed with this action? [y/N]: y
[1/1] Fetching linux_base-rl9-9.4.pkg: 100%   39 MiB  20.5MB/s    00:02    
Checking integrity... done (0 conflicting)
[1/1] Installing linux_base-rl9-9.4...
[1/1] Extracting linux_base-rl9-9.4: 100%

Looking again at the “/compat/linux” structure, this is now populated:

root@freebsd14:~ $ ls -la /compat/linux/
total 39
drwxr-xr-x  11 root wheel  16 Dec 11 17:04 .
drwxr-xr-x   3 root wheel   3 Dec 11 16:15 ..
drwxr-xr-x   2 root wheel   2 Dec 11 17:04 afs
lrwxr-xr-x   1 root wheel   7 May 16  2022 bin -> usr/bin
drwxr-xr-x  10 root wheel 512 Dec 11 17:04 dev
drwxr-xr-x  24 root wheel  60 Dec 11 17:04 etc
lrwxr-xr-x   1 root wheel   7 May 16  2022 lib -> usr/lib
lrwxr-xr-x   1 root wheel   9 May 16  2022 lib64 -> usr/lib64
drwxr-xr-x   2 root wheel   2 Dec 11 17:04 opt
dr-xr-xr-x   1 root wheel   0 Dec 11 17:06 proc
lrwxr-xr-x   1 root wheel   8 Oct 31 02:04 run -> /var/run
lrwxr-xr-x   1 root wheel   8 May 16  2022 sbin -> usr/sbin
drwxr-xr-x   2 root wheel   2 Dec 11 17:04 srv
dr-xr-xr-x   1 root wheel   0 Dec 11 17:06 sys
drwxr-xr-x   8 root wheel   9 Dec 11 17:04 usr
drwxr-xr-x  16 root wheel  17 Dec 11 17:04 var

If we chroot into this, we have a Rocky Linux environment to work with:

root@freebsd14:~ 1 chroot /compat/linux/
sh-5.1$ cat /etc/os-release 
NAME="Rocky Linux"
VERSION="9.4 (Blue Onyx)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="9.4"
PLATFORM_ID="platform:el9"
PRETTY_NAME="Rocky Linux 9.4 (Blue Onyx)"
ANSI_COLOR="0;32"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:rocky:rocky:9::baseos"
HOME_URL="https://rockylinux.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
SUPPORT_END="2032-05-31"
ROCKY_SUPPORT_PRODUCT="Rocky-Linux-9"
ROCKY_SUPPORT_PRODUCT_VERSION="9.4"
REDHAT_SUPPORT_PRODUCT="Rocky Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.4"

What we don’t have is rpm or dnf:

sh-5.1$ which rpm
which: no rpm in (/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin:/root/bin)
sh-5.1$ which dnf
which: no dnf in (/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin:/root/bin)

rpm itself can be made available by installing the rpm4 package first:

root@freebsd14:~ $ pkg search rpm4
rpm4-4.18.2_1                  Red Hat Package Manager
root@freebsd14:~ $ pkg install rpm4

Once we have that, we can download an RPM, e.g. from the Rocky Linux repositories:

root@freebsd14:/compat/linux $ fetch https://dl.rockylinux.org/pub/rocky/9/BaseOS/x86_64/os/Packages/u/unzip-6.0-57.el9.x86_64.rpm
unzip-6.0-57.el9.x86_64.rpm                            179 kB 3851 kBps    00s

Confirm that unzip is not yet available:

root@freebsd14:/compat/linux $ find . -name "unzip"

Extract it into the tree and verify it is usable:

root@freebsd14:/compat/linux $ rpm2cpio < unzip-6.0-57.el9.x86_64.rpm   | cpio -id
787 blocks
root@freebsd14:/compat/linux $ find . -name "unzip"
./usr/bin/unzip
./usr/share/licenses/unzip
./usr/share/doc/unzip
root@freebsd14:/compat/linux $ ./usr/bin/unzip --help
UnZip 6.00 of 20 April 2009, by Info-ZIP.  Maintained by C. Spieler.  Send
bug reports using http://www.info-zip.org/zip-bug.html; see README for details.

Usage: unzip [-Z] [-opts[modifiers]] file[.zip] [list] [-x xlist] [-d exdir]
  Default action is to extract files in list, except those in xlist, to exdir;
  file[.zip] may be a wildcard.  -Z => ZipInfo mode ("unzip -Z" for usage).

  -p  extract files to pipe, no messages     -l  list files (short format)
  -f  freshen existing files, create none    -t  test compressed archive data
  -u  update files, create if necessary      -z  display archive comment only
  -v  list verbosely/show version info       -T  timestamp archive to latest
  -x  exclude files that follow (in xlist)   -d  extract files into exdir
modifiers:
  -n  never overwrite existing files         -q  quiet mode (-qq => quieter)
  -o  overwrite files WITHOUT prompting      -a  auto-convert any text files
  -j  junk paths (do not make directories)   -aa treat ALL files as text
  -U  use escapes for all non-ASCII Unicode  -UU ignore any Unicode fields
  -C  match filenames case-insensitively     -L  make (some) names lowercase
  -X  restore UID/GID info                   -V  retain VMS version numbers
  -K  keep setuid/setgid/tacky permissions   -M  pipe through "more" pager
  -O CHARSET  specify a character encoding for DOS, Windows and OS/2 archives
  -I CHARSET  specify a character encoding for UNIX and other archives

See "unzip -hh" or unzip.txt for more help.  Examples:
  unzip data1 -x joe   => extract all files except joe from zipfile data1.zip
  unzip -p foo | more  => send contents of foo.zip via pipe into program more
  unzip -fo foo ReadMe => quietly replace existing ReadMe if archive file newer

Nice. If you have the requirement to use Linux tools on FreeBSD, this can be very helpful.

L’article FreeBSD basics – 7 – The Linux compatibility layer est apparu en premier sur dbi Blog.

JSON in PostgreSQL : a query tuning case

Wed, 2024-12-11 15:17

Over the years, ORMs have industrialized query generation, allowing developers to focus on optimizing the few queries that truly need it. However, they can sometimes produce overly complex SQL with excessive joins and business logic, making tuning a nightmare. It is sometimes hard to know when you should split a query into multiple smaller parts. The complexity is amplified by the use of JSON, adding another layer for DBAs to manage: although this is not a new data type, best practices on how to use JSON in PostgreSQL are still kind of a dark art. This blog post intends to shed some light on all of this with a practical example.

Improving ORM-Generated Queries

Example of an Inefficient ORM-Generated Query

Below is a truncated and anonymized version of a complex SQL query generated by an ORM, which I had to tune since the execution time (over 90 minutes) wasn’t in line with the business need. Note that the original query was over 1,500 lines:

WITH result_list AS (
    SELECT o.id,
           o.data AS order_data,
           ARRAY_AGG(DISTINCT cr.id) AS customer_refs_id,
           ARRAY_AGG(DISTINCT cr.data) AS customer_refs_data,
           ARRAY_AGG(DISTINCT cp.policy_id) AS customer_policy_id,
           ROW_NUMBER() OVER () AS rowid
    FROM orders o
    LEFT JOIN customer_refs cr ON cr.data->>'id' = o.data->'customer'->>'id'
    LEFT JOIN customer_policies cp ON cp.policy_data->>'id' = o.data->'customer'->>'id'
        AND cp.policy_data->>'region' = ANY (ARRAY['GLOBAL', 'REGION_A', 'REGION_B', 'REGION_C', 'REGION_D'])
        AND (
            (
                cp.policy_data->>'status' IS NOT NULL
                AND cp.policy_data->>'status' NOT IN ('inactive', 'blocked', 'suspended')
            )
            AND (
                (
                    cp.policy_data->>'customer_type' IS NOT NULL
                    AND cp.policy_data->>'customer_type' = 'issuer'
                )
                OR (
                    cp.policy_data->>'customer_type' IS NOT NULL
                    AND cp.policy_data->>'customer_type' IN ('type1', 'type2', 'type3')
                )
                OR (
                    get_text_from_path(cp.policy_data->'shared_groups', '$[*]', 'NONE') IS NOT NULL
                    AND get_text_from_path(cp.policy_data->'shared_groups', '$[*]', 'NONE') && ARRAY['GROUP_1']
                )
            )
        )
    WHERE EXISTS (
        SELECT 1
        FROM account_policies ap
        WHERE ap.policy_id = ARRAY_TO_STRING(ARRAY[
            o.data->'account'->'id'->>'system_ref',
            o.data->'account'->'id'->>'type_ref',
            o.data->'account'->'id'->>'region',
            o.data->'account'->'id'->>'id'], '-')
        AND ap.policy_data->>'region' = ANY (ARRAY['GLOBAL', 'REGION_A', 'REGION_B', 'REGION_C', 'REGION_D'])
        AND ap.policy_data->>'status' NOT IN ('inactive', 'blocked', 'suspended')
        AND ap.policy_data->>'contract' != 'prospect'
        AND (
            ap.policy_data->>'account_type' IN ('typeX', 'typeY', 'typeZ')
            OR get_text_from_path(ap.policy_data->'shared_groups', '$[*]', 'NONE') && ARRAY['GROUP_1']
        )
    )
    AND (
        (
            o.data->>'order_type' = 'MONEY_ORDER'
            AND (
                o.data->'close_date'->>'value' > '2024-09-10'
                OR o.data->'close_date'->>'value' IS NULL
                OR o.data->'close_date'->>'value' = ''
            )
        )
        OR (
            o.data->>'order_type' != 'MONEY_ORDER'
            AND COALESCE(
                NULLIF(o.data->'valuation'->'quantity'->>'value', '')::DECIMAL,
                0
            ) != 0
        )
    )
    AND (
        LOWER(o.data->>'display_name') LIKE '%current%'
        OR LOWER(o.data->'product'->'item'->'item_name'->>'abbreviation') LIKE '%current%'
        OR LOWER(o.data->'product'->'item'->'item_name'->>'full') LIKE '%current%'
        OR LOWER(o.data->'product'->'item'->>'identifier') LIKE '%current%'
        OR LOWER(o.data->'product'->'issuer'->>'display_name') LIKE '%current%'
        OR get_text_for_search_from_path(o.data->'product'->'underlying', '$[*].item.item_name.abbreviation', 'LOWER') LIKE '%current%'
    )
    GROUP BY o.id, o.data
    ORDER BY o.id
),
ordered_list AS (
    SELECT *
    FROM result_list
    ORDER BY rowid
    LIMIT 23
)
SELECT *
FROM ordered_list
ORDER BY rowid;

Key Characteristics of the Query:

  • Multiple Joins: The query includes several LEFT JOIN operations with complex ON conditions involving JSON fields.
  • Nested JSON Access: Frequent use of JSON operators (->, ->>) to access deeply nested fields.
  • Custom Functions: Usage of functions like get_text_from_path and get_text_for_search_from_path, which may not be optimized for performance.
  • Complex WHERE Conditions: The WHERE clause contains multiple nested conditions and subqueries, often involving JSON fields.
Why the Query is Inefficient
  • Accessing deeply nested JSON fields using operators in large datasets can be slow; if they are not properly indexed, this causes full table scans and drastically increases execution time.
  • The query performs numerous LOWER() and LIKE operations on JSON fields, which are computationally expensive and, on top of that, prevent effective index usage.
  • The indexes on JSON fields may not be utilized due to the way conditions are written.
  • The use of LEFT JOIN with conditions that include functions and JSON field comparisons leads to inefficient execution plans.
  • The EXISTS subquery further complicates the execution, potentially causing full table scans.
  • ORMs often generate generic SQL that doesn’t consider database-specific optimizations.
  • The abstraction can lead to redundant or unnecessary conditions and joins.
Result: Bad Execution Plan
Limit  (cost=292646835.98..292646835.99 rows=1 width=1813) (actual time=5504844.738..5504844.745 rows=23 loops=1)
  Buffers: shared hit=3622772239 read=7862262 dirtied=3639
  ->  Sort  (cost=292646835.98..292646835.99 rows=1 width=1813) (actual time=5502255.883..5502255.889 rows=23 loops=1)
        Sort Key: (row_number() OVER (?))
        Sort Method: quicksort  Memory: 114kB
        Buffers: shared hit=3622772239 read=7862262 dirtied=3639
        ->  WindowAgg  (cost=292646835.93..292646835.97 rows=1 width=1813) (actual time=5502254.506..5502255.810 rows=43 loops=1)
              Buffers: shared hit=3622772236 read=7862262 dirtied=3639
              ->  GroupAggregate  (cost=292646835.93..292646835.96 rows=1 width=1805) (actual time=5502253.088..5502253.249 rows=43 loops=1)
                    Group Key: o.id
                    Buffers: shared hit=3622772236 read=7862262 dirtied=3639
                    ->  Sort  (cost=292646835.93..292646835.93 rows=1 width=1930) (actual time=5502244.952..5502244.964 rows=43 loops=1)
                          Sort Key: o.id, cr.id
                          Sort Method: quicksort  Memory: 71kB
                          Buffers: shared hit=3622772236 read=7862262 dirtied=3639
                          ->  Nested Loop  (cost=0.98..292646835.92 rows=1 width=1930) (actual time=2015030.222..5502243.020 rows=43 loops=1)
                                Join Filter: (array_to_string(ARRAY[...]) = ap.policy_id)
                                Rows Removed by Join Filter: 246198062
                                Buffers: shared hit=3622772236 read=7862262 dirtied=3639
---

"Planning:"
"  Buffers: shared hit=649 read=2"
"Planning Time: 133.626 ms"
"JIT:"
"  Functions: 32"
"  Options: Inlining true, Optimization true, Expressions true, Deforming true"
"  Timing: Generation 226.277 ms, Inlining 631.748 ms, Optimization 801.960 ms, Emission 1156.811 ms, Total 2816.797 ms"
"Execution Time: 5505139.477 ms"

Observations from the Execution Plan:

Execution Time: ~90 minutes
Buffer Usage:

  • Shared hits: ~3.6 billion
  • Reads: ~7.8 million
  • Dirtied: ~3,639

Key Issues:

Functions in Joins and Conditions: Prevented effective index usage
Nested Loops: Inefficient row-by-row processing.
Sort Operations: High-cost sorting due to lack of index usage.
Full Table Scans: Extensive rows removed by join filters.

Optimizing the Query

Suggested optimized query:

Here I modified a few things to help the optimizer use the indexes:

  • I created indexes and some generated columns, to avoid applying functions to those fields in the WHERE clause.
  • I changed the LEFT JOINs to INNER JOINs where possible, to reduce the data set early in the query execution.
  • I rewrote some conditions to avoid the use of functions on the JSON fields.
ALTER TABLE orders ADD COLUMN customer_id TEXT GENERATED ALWAYS AS (data->'customer'->>'id') STORED;
ALTER TABLE orders ADD COLUMN order_type TEXT GENERATED ALWAYS AS (data->>'order_type') STORED;
ALTER TABLE orders ADD COLUMN display_name_lower TEXT GENERATED ALWAYS AS (LOWER(data->>'display_name')) STORED;
CREATE INDEX idx_orders_customer_id ON orders (customer_id);
CREATE INDEX idx_orders_order_type ON orders (order_type);
CREATE INDEX idx_orders_display_name_lower ON orders (display_name_lower);
--- ... etc

WITH result_list AS (
    SELECT o.id,
           o.data AS order_data,
           ARRAY_AGG(DISTINCT cr.id) AS customer_refs_id,
           ARRAY_AGG(DISTINCT cr.data) AS customer_refs_data,
           ARRAY_AGG(DISTINCT cp.policy_id) AS customer_policy_id,
           ROW_NUMBER() OVER () AS rowid
    FROM orders o
    INNER JOIN customer_refs cr ON cr.data->>'id' = o.customer_id
    INNER JOIN customer_policies cp ON cp.policy_data->>'id' = o.customer_id
        AND cp.policy_data->>'region' = ANY (ARRAY['GLOBAL', 'REGION_A', 'REGION_B', 'REGION_C', 'REGION_D'])
        AND cp.policy_data->>'status' NOT IN ('inactive', 'blocked', 'suspended')
        AND (
            cp.policy_data->>'customer_type' = 'issuer'
            OR cp.policy_data->>'customer_type' IN ('type1', 'type2', 'type3')
            OR get_text_from_path(cp.policy_data->'shared_groups', '$[*]', 'NONE') && ARRAY['GROUP_1']
        )
    WHERE EXISTS (
        SELECT 1
        FROM account_policies ap
        WHERE ap.policy_id = ARRAY_TO_STRING(ARRAY[
            o.data->'account'->'id'->>'system_ref',
            o.data->'account'->'id'->>'type_ref',
            o.data->'account'->'id'->>'region',
            o.data->'account'->'id'->>'id'], '-')
        AND ap.policy_data->>'region' = ANY (ARRAY['GLOBAL', 'REGION_A', 'REGION_B', 'REGION_C', 'REGION_D'])
        AND ap.policy_data->>'status' NOT IN ('inactive', 'blocked', 'suspended')
        AND ap.policy_data->>'contract' != 'prospect'
        AND (
            ap.policy_data->>'account_type' IN ('typeX', 'typeY', 'typeZ')
            OR get_text_from_path(ap.policy_data->'shared_groups', '$[*]', 'NONE') && ARRAY['GROUP_1']
        )
    )
    AND o.order_type = 'MONEY_ORDER'
    AND (o.data->'close_date'->>'value' > '2024-09-10' OR o.data->'close_date'->>'value' IS NULL OR o.data->'close_date'->>'value' = '')
    AND o.display_name_lower LIKE '%current%'
    GROUP BY o.id, o.data
    ORDER BY o.id
)
SELECT *
FROM result_list
ORDER BY rowid
LIMIT 23;
Updated Execution Plan Analysis

Some developers were also working on tuning this query and decided to implement a materialized view, which solved the issue differently but with the same intent.

Limit  (cost=9013972.46..9013972.46 rows=1 width=1789) (actual time=175757.446..175757.449 rows=0 loops=1)
  Buffers: shared hit=20716139 read=836484 dirtied=1676
  ->  Sort  (cost=9013972.46..9013972.46 rows=1 width=1789) (actual time=174722.049..174722.052 rows=0 loops=1)
        Sort Key: (row_number() OVER (?))
        Sort Method: quicksort  Memory: 25kB
        Buffers: shared hit=20716139 read=836484 dirtied=1676
        ->  WindowAgg  (cost=9013972.40..9013972.45 rows=1 width=1789) (actual time=174722.010..174722.013 rows=0 loops=1)
              Buffers: shared hit=20716136 read=836484 dirtied=1676
              ->  GroupAggregate  (cost=9013972.40..9013972.44 rows=1 width=1781) (actual time=174722.008..174722.011 rows=0 loops=1)
                    Group Key: positions.id
                    Buffers: shared hit=20716136 read=836484 dirtied=1676
                    ->  Sort  (cost=9013972.40..9013972.41 rows=1 width=1906) (actual time=174722.007..174722.009 rows=0 loops=1)
                          Sort Key: positions.id, client_refs.id
                          Sort Method: quicksort  Memory: 25kB
                          Buffers: shared hit=20716136 read=836484 dirtied=1676
                          ->  Nested Loop  (cost=1.41..9013972.39 rows=1 width=1906) (actual time=174722.001..174722.004 rows=0 loops=1)
                                Buffers: shared hit=20716136 read=836484 dirtied=1676
                                ->  Nested Loop Left Join  (cost=0.99..9013961.75 rows=1 width=1933) (actual time=184.358..170945.266 rows=8554 loops=1)
                                      Buffers: shared hit=20683249 read=835147 dirtied=1672
                                      ->  Nested Loop Left Join  (cost=0.56..9013950.91 rows=1 width=1931) (actual time=138.461..146595.622 rows=8554 loops=1)
                                            Join Filter: (client_refs.id = pv.client_id)
                                            Rows Removed by Join Filter: 328,884,138
                                            Buffers: shared hit=20655467 read=828553 dirtied=1153
                                            ->  Nested Loop  (cost=0.56..9008615.75 rows=1 width=1736) (actual time=33.568..75336.287 rows=8554 loops=1)
                                                  Buffers: shared hit=27,870 read=824,707 dirtied=851
                                                  ->  Seq Scan on positions_view pv  (cost=0.00..9008612.97 rows=1 width=78) (actual time=16.384..12504.255 rows=8554 loops=1)
                                                        Filter: ...
                                                        Rows Removed by Filter: 32,144,940
                                                        Buffers: shared hit=7 read=809,465
                                                  ->  Index Scan using positions_pkey on positions positions  (cost=0.56..2.78 rows=1 width=1685) (actual time=7.327..7.327 rows=1 loops=8554)
                                                        Index Cond: (id = pv.id)
                                                        Buffers: shared hit=27,550 read=15,242 dirtied=602
                                            ->  Seq Scan on client_refs  (cost=0.00..4513.96 rows=65,696 width=195) (actual time=0.004..5.734 rows=38,449 loops=8554)
                                                  Buffers: shared hit=20,627,597 read=3,846 dirtied=302
                                      ->  Index Scan using clients_policy_pkey on clients_policy  (cost=0.43..10.84 rows=1 width=26) (actual time=2.841..2.841 rows=0 loops=8554)
                                            Index Cond: (policy_id = pv.client_id)
                                            Filter: ...
                                            Rows Removed by Filter: 1
                                            Buffers: shared hit=27,782 read=6,594 dirtied=519
                                ->  Index Scan using idx_btree_portfolios_policy_id on portfolios_policy  (cost=0.42..10.64 rows=1 width=28) (actual time=0.439..0.439 rows=0 loops=8554)
                                      Index Cond: (policy_id = pv.portfolio_id)
                                      Filter: ...
                                      Rows Removed by Filter: 1
                                      Buffers: shared hit=32,887 read=1,337 dirtied=4
                                      
---

"Planning:"
"  Buffers: shared hit=954 read=78 dirtied=9"
"Planning Time: 130.627 ms"
"JIT:"
"  Functions: 33"
"  Options: Inlining true, Optimization true, Expressions true, Deforming true"
"  Timing: Generation 5.483 ms, Inlining 81.013 ms, Optimization 530.635 ms, Emission 423.850 ms, Total 1040.982 ms"
"Execution Time: 175799.036 ms"

Execution Time: Under 3 minutes

Buffer Usage:

  • Shared hits: ~20.7 million
  • Reads: ~836,484
  • Dirtied: ~1,676

Improvements:

Efficient Joins: Using INNER JOIN limited the dataset early.
Indexed Columns: Enabled rapid data retrieval without full scans.
Simplified Conditions: Eliminating some functions in WHERE clauses allowed index usage.
Reduced Buffer Operations: Significantly lowered the I/O overhead.

Comparison to the Previous Execution Plan:

  • Despite processing a large number of rows, the optimized query executes much faster.
  • The use of materialized views allows for more efficient joins and filtering, as data is stored in a format that is easier for the database to handle.

When to Use JSON vs. Traditional Table Columns

Use JSON When:
  • Schema Flexibility is Required: If your application deals with varying data structures or frequent schema changes, JSON allows you to store this data without altering the database schema.
  • Storing Semi-Structured or Unstructured Data: JSON is ideal for data that doesn’t fit neatly into a relational schema, such as logs, configuration files, or user-generated content.
  • Aggregating Diverse Data: When consolidating data from multiple sources with different schemas, JSON can accommodate the discrepancies without data loss.
  • Embedding Related Data: Storing related data together as JSON can simplify data retrieval in certain scenarios, reducing the need for complex joins.
Use Traditional Table Columns When:
  • Data Integrity is Crucial: Relational columns allow for constraints, foreign keys, and data types that enforce data validity (see the sketch after this list for what the JSONB equivalent looks like).
  • Frequent Updates on Specific Fields: Updating a field within a JSON object requires rewriting the entire object, whereas individual columns can be updated more efficiently.
  • Complex Queries and Reporting: SQL excels at querying structured data. If you need to perform complex queries, aggregations, or reports, structured columns are more efficient.
  • Performance Optimization: Indexing and query planning are generally more effective with traditional columns.
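
For comparison, you can approximate such a validity rule on a JSONB document with an expression-based CHECK constraint, but it stays more fragile and more verbose than a typed column. A minimal sketch, assuming a hypothetical users table with a jsonb column named data:

-- Enforce that the age stored inside the JSON document is a positive integer
ALTER TABLE users
  ADD CONSTRAINT users_age_is_positive
  CHECK ( (data->>'age')::int > 0 );

Note that a missing 'age' key evaluates to NULL and therefore passes the CHECK, while a non-numeric value makes the cast fail at write time; a plain integer column with a NOT NULL constraint handles both cases more cleanly.
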
Best Practices for Running Efficient Queries on JSON Data

Utilize JSON Functions and Operators

PostgreSQL offers a rich set of functions and operators for querying JSON data:

  • Access Operators:
  • -> : Returns a JSON object field by key.
  • ->> : Returns a JSON object field by key as text.
  • #> : Returns JSON sub-object at the specified path.
  • #>> : Returns JSON sub-object at the specified path as text.

Table and Sample Data
postgres=# create database jsonlab;
CREATE DATABASE
postgres=# \c jsonlab
You are now connected to database "jsonlab" as user "postgres".
jsonlab=# CREATE TABLE users (
    id serial PRIMARY KEY,
    data jsonb
);
CREATE TABLE
jsonlab=# INSERT INTO users (data) VALUES
    ('{"name": "Alice", "age": 30, "address": {"city": "New York", "zip": "10001"}}'),
    ('{"name": "Bob", "age": 25, "address": {"city": "Los Angeles", "zip": "90001"}}'),
    ('{"name": "Charlie", "age": 35, "address": {"city": "Chicago", "zip": "60601"}}');
INSERT 0 3
jsonlab=# SELECT
    data->'name' AS name_json,                  -- Returns name as JSON
    data->>'age' AS age_text,                   -- Returns age as text
    data#>'{address}' AS address_json,          -- Returns the address object as JSON
    data#>>'{address,city}' AS city_text        -- Returns the city from the address as text
FROM users;
 name_json | age_text |              address_json               |  city_text
-----------+----------+-----------------------------------------+-------------
 "Alice"   | 30       | {"zip": "10001", "city": "New York"}    | New York
 "Bob"     | 25       | {"zip": "90001", "city": "Los Angeles"} | Los Angeles
 "Charlie" | 35       | {"zip": "60601", "city": "Chicago"}     | Chicago
(3 rows)
Indexing for Performance

To speed up querying JSON data, we can create indexes on the fields being accessed frequently. PostgreSQL allows you to create different types of indexes for JSON and JSONB fields.
Here are some example indexes to support the above query:

  • idx_users_data_gin:
    A general-purpose GIN index on the data column to speed up key-value searches.
  • idx_users_data_name:
    A BTREE index on the name field to quickly access user names.
  • idx_users_data_city:
    A BTREE index on the city field in the address to efficiently retrieve users by their city.
  • idx_users_age_above_30:
    A partial index, useful when a condition is common and you want to reduce the index overhead (this one is just for the example; it doesn’t make much sense here).

These indexes will significantly improve the performance of JSON queries, especially when dealing with large datasets.

CREATE INDEX idx_users_data_gin ON users USING gin (data);
CREATE INDEX idx_users_data_name ON users ((data->>'name'));
CREATE INDEX idx_users_data_city ON users ((data#>>'{address,city}'));
CREATE INDEX idx_users_age_above_30 ON users ((data->>'age'))
    WHERE (data->>'age')::int > 30;
Avoid Unnecessary Casting

The optimizer is still the same when you are using JSON operators, so “hiding” the intent of your code behind complexity has the same drawbacks as ever. Casting JSON data unnecessarily can lead to performance degradation, since the optimizer might not take advantage of existing indexes. Using the appropriate operator to retrieve data in the desired format is important.
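
As a small illustration on the users table from the previous section (assuming the idx_users_data_name expression index created above, and a table large enough for index scans to be attractive): PostgreSQL can only use an expression index when the query repeats exactly the same expression, so changing the operator or adding a cast silently disables it:

-- Matches the indexed expression (data->>'name'), so idx_users_data_name can be used
EXPLAIN SELECT * FROM users WHERE data->>'name' = 'Alice';

-- Logically the same filter, but the expression no longer matches the index
-- definition, so PostgreSQL falls back to a sequential scan
EXPLAIN SELECT * FROM users WHERE (data->'name')::text = '"Alice"';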

Leverage JSON Path Queries

Here are some JSON path query principles.
PostgreSQL 12 introduced jsonpath support, allowing for more advanced querying capabilities. This was a huge step forward for navigating complex JSON structures. While XML was generally used for highly structured data in the past, this feature made the case for JSONB once more.
In this example, I want to display all orders that contain at least one item priced above 100:

SELECT data
FROM orders
WHERE data @? '$.items[*] ? (@.price > 100)';
Explanation
  • data: This is the JSONB column storing the order details.
  • @?: This operator is used to evaluate a JSON path predicate and returns true or false based on whether the condition is met.
  • '$.items[*] ? (@.price > 100)':
    • $: Represents the root of the JSON document.
    • items[*]: Selects all elements in the items array.
    • ?: The filter expression that applies a condition.
    • @.price > 100: The condition being checked — in this case, whether the price of any item is greater than 100.

The equivalent of that query with standard relational tables would look like this:

SELECT DISTINCT o.order_id
FROM orders o
JOIN items i ON o.order_id = i.order_id
WHERE i.price > 100;
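
If this kind of jsonpath filter is frequent, note that the @? (and @@) operators can be backed by a GIN index on the jsonb column. A minimal sketch, reusing the hypothetical orders table from the example above:

-- jsonb_path_ops builds a smaller index but only supports the @>, @? and @@ operators
CREATE INDEX idx_orders_data_path ON orders USING gin (data jsonb_path_ops);

-- The query above can now be answered via the index instead of scanning every row
EXPLAIN SELECT data FROM orders WHERE data @? '$.items[*] ? (@.price > 100)';
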
Conclusion

By using materialized views, views, or generated columns to pre-extract JSON fields into regular columns, we can dramatically improve query performance. This approach reduces the computational burden during query execution and allows the database optimizer to effectively use indexes. Oddly enough, old tricks still work :).

JSON support in PostgreSQL 17 allows developers and DBAs to design flexible, efficient, and scalable applications. By understanding when and how to use JSON effectively, you can leverage PostgreSQL’s full potential, combining the reliability of a relational database with the flexibility of a dynamic schema. In a world where RAG search through vector or graph databases is the new trend, knowing how JSONB and PostgreSQL work together can bring you closer to the finish line.

L’article JSON in PostgreSQL : a query tuning case est apparu en premier sur dbi Blog.

FreeBSD basics – 6 – The ports collection

Wed, 2024-12-11 08:27

If you followed the previous introduction posts about FreeBSD (here, here, here, here and here) you should already know how to install binary packages. On FreeBSD this is done using pkg, and if you are familiar with apt, dnf, or zypper (or any other package manager on Linux) this should not be a big deal. There is, however, another official way to get additional software onto FreeBSD, and these are the so-called “ports”. In contrast to binary packages, ports are source code that comes with all the patches and Makefiles required to compile and install a specific piece of software on FreeBSD. Think of it as an automated way of compiling software for FreeBSD.

Depending on how you installed FreeBSD, the ports collection might already be there:

root@freebsd14:~ $ ls -l /usr/ports/ | wc -l
      81

If it’s not there, you have a couple of options to get it installed. Either you get it with git (the following will get the HEAD branch, check the link for a description of how to check out a quarterly branch):

root@freebsd14:~ $ pkg install git
root@freebsd14:~ $ git clone --depth 1 https://git.FreeBSD.org/ports.git /usr/ports

… or you get it with portsnap:

root@freebsd14:~ $ mkdir -p /var/db/portsnap
root@freebsd14:~ $ portsnap fetch
root@freebsd14:~ $ portsnap extract

… or you download the tarball from the website. No matter how you do it, once the ports are available you’ll see something like this:

root@freebsd14:/usr/ports $ pwd
/usr/ports
root@freebsd14:/usr/ports $ ls
.arcconfig      Keywords        accessibility   comms           finance         japanese        net-im          science         x11-clocks
.git            MOVED           arabic          converters      french          java            net-mgmt        security        x11-drivers
.gitignore      Makefile        archivers       databases       ftp             korean          net-p2p         shells          x11-fm
.hooks          Mk              astro           deskutils       games           lang            news            sysutils        x11-fonts
.mailmap        README          audio           devel           german          mail            polish          textproc        x11-servers
CHANGES         Templates       benchmarks      dns             graphics        math            ports-mgmt      ukrainian       x11-themes
CONTRIBUTING.md Tools           biology         editors         hebrew          misc            portuguese      vietnamese      x11-toolkits
COPYRIGHT       UIDs            cad             emulators       hungarian       multimedia      print           www             x11-wm
GIDs            UPDATING        chinese         filesystems     irc             net             russian         x11

Each of those directories is a category, and the categories contain the ports grouped into that category:

root@freebsd14:~ $ ls /usr/ports/databases/
Makefile                                        p5-DBIx-Log4perl                                py-aiosqlite
R-cran-DBI                                      p5-DBIx-MySQLSequence                           py-alembic
R-cran-RMySQL                                   p5-DBIx-NoSQL                                   py-apache-arrow
R-cran-RPostgreSQL                              p5-DBIx-Password                                py-apsw
...
p5-DBIx-Inspector                               py-aiopg                                        xtrabackup84
p5-DBIx-Introspector                            py-aioredis                                     zodb
p5-DBIx-Lite                                    py-aiosql

There are thousands of ports and looking for something in the directory tree might become time-consuming. Searching for something can be done with e.g. “whereis“:

root@freebsd14:~ $ whereis puredb
puredb: /usr/ports/databases/puredb

… or you search directly in the ports tree, but before you can do this you need to build the index file:

root@freebsd14:/usr/ports $ make search postgres
The search target requires INDEX-14. Please run make index or make fetchindex.
make: don't know how to make postgres. Stop

make: stopped in /usr/ports

root@freebsd14:/usr/ports $ make index
Generating INDEX-14 - please wait..
 Done.
root@freebsd14:/usr/ports $ 

This created the INDEX file which is used as the source for searches in the tree:

root@freebsd14:/usr/ports $ ls -l INDEX-14 
-rw-r--r--  1 root wheel 48207086 Dec 11 10:13 INDEX-14
root@freebsd14:/usr/ports $ make search name=postgresql | less
Port:   R-cran-RPostgreSQL-0.7.7
Path:   /usr/ports/databases/R-cran-RPostgreSQL
Info:   R Interface to the 'PostgreSQL' Database System
Maint:  tota@FreeBSD.org
...
Port:   databases/postgresql11-pgtcl
Moved:  databases/postgresql12-pgtcl
Date:   2023-12-31
Reason: Has expired: Security support ended on 2023-11-09

The README comes with additional hints and information:

root@freebsd14:/usr/ports $ cat README 
This is the FreeBSD Ports Collection.  For an easy to use
WEB-based interface to it, please see:

        https://ports.FreeBSD.org

For general information on the Ports Collection, please see the
FreeBSD Handbook ports section which is available from:

        https://docs.freebsd.org/en/books/handbook/ports/
                for the latest official version
        or:
        The ports(7) manual page (man ports).

These will explain how to use ports and packages.

If you would like to search for a port, you can do so easily by
saying (in /usr/ports):

        make search name="<name>"
        or:
        make search key="<keyword>"

which will generate a list of all ports matching <name> or <keyword>.
make search also supports wildcards, such as:

        make search name="gtk*"

For information about contributing to FreeBSD ports, please see the Porter's
Handbook, available at:

        https://docs.freebsd.org/en/books/porters-handbook/

NOTE:  This tree will GROW significantly in size during normal usage!
The distribution tar files can and do accumulate in /usr/ports/distfiles,
and the individual ports will also use up lots of space in their work
subdirectories unless you remember to "make clean" after you're done
building a given port.  /usr/ports/distfiles can also be periodically
cleaned without ill-effect.

Let’s say we want to install PostgreSQL 17, so what do we need to do? Actually, this is quite easy. First, enter the ports directory:

root@freebsd14:/usr/ports $ cd databases/postgresql17-server/
root@freebsd14:/usr/ports/databases/postgresql17-server $

If you’re curious, have a look at the Makefile to see what will happen, otherwise just install the port (“clean” will clean up afterwards to save space on disk):

root@freebsd14:/usr/ports/databases/postgresql17-server $ make install clean

This will bring up a new window which gives you some options to either enable or disable:

Once you’re happy with it, let it build and install. This will start downloading the PostgreSQL sources and then ask for options for bison:

This goes on for all the tools required to build PostgreSQL:

… (and so on) and finally you’ll see the usual “configure” and “make” output on the screen (this will run the regression tests as well, so expect it to take some time). If you just want to get a feeling for ports, maybe don’t choose PostgreSQL as your first port, because this really takes time as all the dependencies (llvm, …) get built from source as well.

Once it is done you’ll find the binaries in “/usr/local/bin/”:

root@freebsd14:/usr/ports/databases/postgresql17-server $ ls /usr/local/bin/pg*
/usr/local/bin/pg_amcheck               /usr/local/bin/pg_ctl                   /usr/local/bin/pg_restore
/usr/local/bin/pg_archivecleanup        /usr/local/bin/pg_dump                  /usr/local/bin/pg_rewind
/usr/local/bin/pg_basebackup            /usr/local/bin/pg_dumpall               /usr/local/bin/pg_test_fsync
/usr/local/bin/pg_checksums             /usr/local/bin/pg_isready               /usr/local/bin/pg_test_timing
/usr/local/bin/pg_config                /usr/local/bin/pg_receivewal            /usr/local/bin/pg_upgrade
/usr/local/bin/pg_controldata           /usr/local/bin/pg_recvlogical           /usr/local/bin/pg_waldump
/usr/local/bin/pg_createsubscriber      /usr/local/bin/pg_resetwal              /usr/local/bin/pgbench
root@freebsd14:/usr/ports/databases/postgresql17-server $ /usr/local/bin/psql --version
psql (PostgreSQL) 17.2
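
The server built from the port is managed exactly like the packaged one. As a minimal sketch (these are the same steps the port’s own install message suggests), you could enable, initialize and start it like this:

root@freebsd14:~ $ sysrc postgresql_enable=yes
root@freebsd14:~ $ service postgresql initdb
root@freebsd14:~ $ service postgresql start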

All the questions which were answered interactively can also be given on the command line to automate the whole process (‘-‘ means do not include, ‘+’ means include):

root@freebsd14:/usr/ports/databases/postgresql17-server $ make pretty-print-config
-DEBUG +DOCS -DTRACE -GSSAPI -LDAP +LLVM +LZ4 +NLS -OPTIMIZED_CFLAGS -PAM +SSL -TZDATA +XML +ZSTD 

A more verbose description of those can be shown like this:

root@freebsd14:/usr/ports/databases/postgresql17-server $ make showconfig
===> The following configuration options are available for postgresql17-server-17.2:
     DEBUG=off: Build with debugging support
     DOCS=on: Build and/or install documentation
     DTRACE=off: Build with DTrace probes
     GSSAPI=off: Build with GSSAPI support
     LDAP=off: Build with LDAP authentication support
     LLVM=on: Build with support for JIT-compiling expressions
     LZ4=on: Build with LZ4 compression support
     NLS=on: Use internationalized messages
     OPTIMIZED_CFLAGS=off: Builds with compiler optimizations (-O3)
     PAM=off: Build with PAM Support
     SSL=on: Build with OpenSSL support
     TZDATA=off: Use internal timezone database
     XML=on: Build with XML data type
     ZSTD=on: Build with ZSTD compression support
===> Use 'make config' to modify these settings
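
If you want to skip the dialogs entirely, the options can also be set non-interactively. This is a minimal sketch, assuming the standard ports framework variables BATCH, OPTIONS_SET and OPTIONS_UNSET (the option names are the ones from the output above):

# accept the saved/default options without showing any dialogs
root@freebsd14:/usr/ports/databases/postgresql17-server $ make BATCH=yes install clean

# or toggle individual options (usually done in /etc/make.conf, but they can also be passed on the command line)
root@freebsd14:/usr/ports/databases/postgresql17-server $ make install clean OPTIONS_SET="LZ4 ZSTD" OPTIONS_UNSET="LLVM NLS"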

To summarize: A lot of additional stuff is available in the ports collection. If something is not available as a package, have a look there.

If you want to see what’s going on in the ports collection, have a look at FreshPorts.

The article FreeBSD basics – 6 – The ports collection appeared first on the dbi Blog.

Help yourself out with the ‘log_reuse_wait_desc’ column

Tue, 2024-12-10 03:54

A few days ago a customer called me to ask if I could help him with a problem with the log file.

He tried to shrink it, but the system wasn’t doing anything and he didn’t know why.

With the column ‘log_reuse_wait_desc’ you can find out why.

When I asked him what was in the ‘log_reuse_wait_desc’ column for the database, there was silence at first, then the question quickly arose as to where the column could be found at all.

But let’s start at the beginning.

I arranged a session with the customer and asked him to show me what he had in mind.

He tried to reduce the log file to the smallest possible size with the following command:
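
The command itself was only visible in a screenshot; as a sketch of what such a shrink looks like (database and log file names are made up for illustration), it was something along these lines:

-- hypothetical database and log file names
USE [CustomerDB];
DBCC SHRINKFILE (N'CustomerDB_log', 1);  -- try to shrink the log file down to 1 MB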

The command completed successfully, but nothing happened.

But why?

I quickly found out that this database was part of an AlwaysOn availability group.

Some time before his call, an application had mistakenly made large changes to the database. This caused the log file to grow enormously, hence the desire to restore the old size.

The well-known system view sys.databases contains a column named ‘log_reuse_wait_desc’ which tells us why it does not work.

So I executed a query based on the name of the database and the column:
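
The query was shown as a screenshot only; a minimal version of it (the database name is again a placeholder) looks like this:

SELECT name, log_reuse_wait_desc
FROM   sys.databases
WHERE  name = N'CustomerDB';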

The query returned the log reuse wait ‘AVAILABILITY_REPLICA’.

This means that the transactions recorded in the log still have to be sent to and applied on a secondary replica. Until that has happened, the log cannot be truncated and the file cannot be shrunk.

We therefore had to wait until all transactions had been replicated and the status had changed:

Once the status changed to ‘ACTIVE_TRANSACTION’, we were already able to shrink away the previous growth.

Then we could reduce the size of the file.

Are there any other log waits?

The following are listed in the official Microsoft documentation:

  • NOTHING
  • CHECKPOINT
  • LOG_BACKUP
  • ACTIVE_BACKUP_OR_RESTORE
  • ACTIVE_TRANSACTION
  • DATABASE_MIRRORING
  • REPLICATION
  • DATABASE_SNAPSHOT_CREATION
  • LOG_SCAN
  • AVAILABILITY_REPLICA
  • OLDEST_PAGE
  • XTP_CHECKPOINT
  • SLOG_SCAN
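
If you want to check this for all databases of an instance at once, a simple variation of the query above does the job:

SELECT name, log_reuse_wait_desc
FROM   sys.databases
ORDER BY log_reuse_wait_desc, name;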

Customers often ask me exactly the opposite. Why can’t I reduce the size of the log file even though I have only made small changes to the database?

The main problem lies with the VLFs.

When the transaction log is not growing, and you’re taking regular log backups, but the log_reuse_wait_desc stays at LOG_BACKUP, the reason is that the previous log backup did not clear any Virtual Log Files (VLFs).

How can this occur?

Consider a database with minimal insert, update, delete, or DDL activity. In this scenario, only a few log records are generated between your regular log backups, and they all reside in the same Virtual Log File (VLF). When the next log backup runs, it backs up these few log records but cannot clear the current VLF, preventing the log_reuse_wait_desc from changing. Once enough changes are made in the database to fill the current VLF and activate the next one, the subsequent log backup should be able to clear the previous VLF, causing the log_reuse_wait_desc to change to NOTHING. However, if the next log backup cannot clear the current VLF, the log_reuse_wait_desc will revert to LOG_BACKUP again.
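
To see how many VLFs a log file currently has (and how many of them are still active), a quick check could look like this, assuming SQL Server 2016 SP2 or later where sys.dm_db_log_info is available (the database name is again a placeholder):

SELECT COUNT(*)                                        AS vlf_count,
       SUM(CASE WHEN vlf_active = 1 THEN 1 ELSE 0 END) AS active_vlfs
FROM   sys.dm_db_log_info(DB_ID(N'CustomerDB'));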

If you want to find out more about VLFs, take a look at my colleague’s blog entry. He has covered the topic in great detail:

The article Help yourself out with the ‘log_reuse_wait_desc’ column appeared first on the dbi Blog.

Building multi-architecture images with GitLab CI/CD

Fri, 2024-12-06 03:06

Building multi-architecture images becomes more and more useful. Indeed, many recent computers use the ARM processor architecture; examples include MacBooks with M(x) processors and Amazon EC2 instances with AWS Graviton processors.

However, the diversification of processor architectures adds a new level of complexity to the creation of container images: the build has to cope with different instruction sets.

Docker buildx, the solution for building multi-architecture images

For the YaK project, we want to make amd64 (x86) and arm64 images available using GitLab CI/CD.

In order to create a build compatible with several architectures, I had to use “docker buildx” in my .gitlab-ci.yml file:

build:
  image: docker:latest
  stage: build
  services:
    - docker:dind
  before_script:
    - docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
    - docker buildx create --name yakbuilder --use
  script:
    - docker buildx build --pull --builder=yakbuilder --platform linux/amd64,linux/arm64 -t [IMG]:[TAG] --push .

How it works:

  • In the "before_script" section, I run a QEMU container to set up emulation for ARM and create a buildx builder context that uses it
  • In the "script" section itself, instead of a simple “docker build”, I use the "docker buildx build" command
  • I also pass the buildx context created in the "before_script" with the --builder flag
  • Finally, I add the list of architectures required for the build with the --platform flag

Build Result

With this method, the build is slower. That’s normal as several images are created (one per architecture) instead of just one.

The result can be seen in the GitLab container registry:

container registry details

Now, below the image tag, a small “index” label is shown. This refers to the fact that several images are available for this tag. During the image pull, the container engine will choose the image version corresponding to its architecture.
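
If you want to double-check which architectures actually ended up behind a tag, you can inspect the manifest list from any machine with buildx available (image and tag are placeholders, as above):

docker buildx imagetools inspect [IMG]:[TAG]

The output lists one manifest per platform, linux/amd64 and linux/arm64 in our case.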

Conclusion

With buildx and QEMU in GitLab CI/CD, building multi-architecture images is easy. You can support different processor architectures, meet the needs of a wide range of users, and ensure the compatibility of your container images.

The article Building multi-architecture images with GitLab CI/CD appeared first on the dbi Blog.

Oracle Database on Apple Silicon with Virtualbox 7.1

Thu, 2024-12-05 12:57

I blogged in the past about running an Oracle database for testing purposes on Apple Silicon: initially with emulated Intel code here, and then (when Oracle released its 19c database software for ARM) with UTM as a wrapper for the QEMU hypervisor here. In the meantime another possibility has come up, since Oracle released its type II hypervisor Virtualbox for ARM in version 7.1.

Below are the steps to set up an Oracle database from scratch in a Virtualbox VM on my MacBook Air M1.

Install Virtualbox and a VM with Oracle Enterprise Linux 8

First things first: Install Virtualbox. To do that go to www.virtualbox.org and click on the download link:

Click on “macOS / Apple Silicon hosts”, which downloads VirtualBox-7.1.4-165100-macOSArm64.dmg (at the time of writing this blog):

In Finder double click the DMG to install Virtualbox. Go through the usual installation process until you see the following screen:

When starting Virtualbox for the first time, I switched to Expert mode:

But before creating a VM I downloaded Oracle Enterprise Linux 8 for ARM from here:

Then I created a new VM by clicking “New” in Virtualbox:

I provided the requested information:

Then clicked “Finish”.

Before starting the VM I enabled host caching under Settings -> Storage -> Controller: VirtioSCSI:

I set the network to bridged, so that I also can reach the VM remotely for some tests (if you want to access your VM remotely, you have to adjust the firewall on your Mac):

Then I powered the machine on by clicking “Start”. I had to grant access to the keyboard and computer in the security settings of the Mac. The VM booted from the ISO, I could see the installation screen and chose to install Oracle Linux 8:

Then I chose the language and other settings and began the installation:

Then I rebooted the system:

Install Oracle 19c (19.25)

After the reboot I can log in to the machine through ssh as root and prepare for the Oracle installation:

# update the OS and install the Oracle 19c preinstall package (kernel parameters, users, groups, ...)
dnf -y update 
dnf -y install oracle-database-preinstall-19c.aarch64

# create the Oracle base directory
mkdir -p /u01/app/oracle
chown -R oracle:oinstall /u01/app

# stop the firewall for the installation (re-configured later on)
systemctl stop firewalld

# set a password for the oracle user
passwd oracle

# optional: xterm and a VNC server for graphical tools
dnf -y install xterm.aarch64 tigervnc-server.aarch64

The Oracle software can be downloaded here. I uploaded it to the /tmp directory of the newly created VM and installed Oracle 19c:

su - oracle
mkdir -p /u01/app/oracle/product/19.0.0/dbhome_1
cd /u01/app/oracle/product/19.0.0/dbhome_1

unzip -oq /tmp/LINUX.ARM64_1919000_db_home.zip

export ORACLE_BASE=/u01/app/oracle
export ORACLE_HOME=/u01/app/oracle/product/19.0.0/dbhome_1
export ORACLE_HOSTNAME=`hostname`
export ORA_INVENTORY=/u01/app/oraInventory


./runInstaller -ignorePrereq -waitforcompletion -silent             \
    -responseFile ${ORACLE_HOME}/install/response/db_install.rsp    \
    oracle.install.option=INSTALL_DB_SWONLY                         \
    ORACLE_HOSTNAME=${ORACLE_HOSTNAME}                              \
    UNIX_GROUP_NAME=oinstall                                        \
    INVENTORY_LOCATION=${ORA_INVENTORY}                             \
    SELECTED_LANGUAGES=en,en_GB                                     \
    ORACLE_HOME=${ORACLE_HOME}                                      \
    ORACLE_BASE=${ORACLE_BASE}                                      \
    oracle.install.db.InstallEdition=EE                             \
    oracle.install.db.OSDBA_GROUP=dba                               \
    oracle.install.db.OSBACKUPDBA_GROUP=backupdba                   \
    oracle.install.db.OSDGDBA_GROUP=dgdba                           \
    oracle.install.db.OSKMDBA_GROUP=kmdba                           \
    oracle.install.db.OSRACDBA_GROUP=racdba                         \
    SECURITY_UPDATES_VIA_MYORACLESUPPORT=false                      \
    DECLINE_SECURITY_UPDATES=true

su - root
/u01/app/oraInventory/orainstRoot.sh
/u01/app/oracle/product/19.0.0/dbhome_1/root.sh
exit

Then install the 19.25 release update (plus the current OPatch), which I copied to /tmp/19.25:

cd $ORACLE_HOME
mv OPatch OPatch_1919
unzip -oq /tmp/19.25/p6880880_190000_Linux-ARM-64.zip 
cd /tmp/19.25
unzip -oq p36912597_190000_Linux-ARM-64.zip 
cd 36912597/
$ORACLE_HOME/OPatch/opatch apply
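
As an optional check (not part of the original steps), the installed patches can be listed afterwards:

$ORACLE_HOME/OPatch/opatch lspatches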

Start a listener and install a DB

$ORACLE_HOME/bin/lsnrctl start

export ORACLE_SID=db19ee
export PDB_NAME=pdb1
export DATA_DIR=$ORACLE_BASE/oradata
mkdir -p $DATA_DIR

$ORACLE_HOME/bin/dbca -silent -createDatabase                            \
      -templateName General_Purpose.dbc                                   \
      -gdbname ${ORACLE_SID} -sid  ${ORACLE_SID} -responseFile NO_VALUE   \
      -characterSet AL32UTF8                                              \
      -sysPassword oracle_4U                                              \
      -systemPassword oracle_4U                                           \
      -createAsContainerDatabase true                                     \
      -numberOfPDBs 1                                                     \
      -pdbName ${PDB_NAME}                                                \
      -pdbAdminPassword oracle_4U                                         \
      -databaseType MULTIPURPOSE                                          \
      -memoryMgmtType auto_sga                                            \
      -totalMemory 2500                                                   \
      -storageType FS                                                     \
      -datafileDestination "${DATA_DIR}"                                  \
      -redoLogFileSize 200                                                \
      -emConfiguration NONE                                               \
      -ignorePreReqs

$ORACLE_HOME/bin/sqlplus / as sysdba <<EOF
alter system set db_create_file_dest='${DATA_DIR}';
alter pluggable database ${PDB_NAME} save state;
exit;
EOF

As root, adjust the firewalld configuration:

# systemctl start firewalld
# cd /etc/firewalld/services
# vi oranetwork.xml
# cat oranetwork.xml
<?xml version="1.0" encoding="utf-8"?>
<service>
  <short>oranetwork</short>
  <description>Oracle Network communication</description>
  <port protocol="tcp" port="1521"/>
</service>
# firewall-cmd --permanent --add-service=oranetwork
# firewall-cmd --reload 
# firewall-cmd --list-all
...
  services: cockpit dhcpv6-client oranetwork ssh
...

Finally I can connect to my 19.25 DB:

[oracle@db19c-arm ~]$ . oraenv
ORACLE_SID = [db19ee] ? db19ee
The Oracle base remains unchanged with value /u01/app/oracle
[oracle@db19c-arm ~]$ sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Thu Dec 5 19:35:34 2024
Version 19.25.0.0.0

Copyright (c) 1982, 2024, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.25.0.0.0

SQL> show pdbs

    CON_ID CON_NAME			  OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
	 2 PDB$SEED			  READ ONLY  NO
	 3 PDB1 			  READ WRITE NO
SQL> 
Summary

Consultants and DBAs who are used to running Oracle Database test environments on Virtualbox can now do this on Apple Silicon as well. The steps to set up and run an Oracle DB on ARM are the same as on x86-64. Please consider that there are still some limitations on ARM, e.g. 19c is not supported on Oracle Enterprise Linux 9 (OEL9) on ARM.

The article Oracle Database on Apple Silicon with Virtualbox 7.1 appeared first on the dbi Blog.

FreeBSD basics – 5 – Providing services using Jails

Thu, 2024-12-05 00:46

This post builds on top of the previous post, in which we created a simple thin Jail. The goal of this post is to put a bit of automation around this and to provide a simple PostgreSQL-as-a-service platform. We’ll do this by creating one PostgreSQL instance per Jail (without any restrictions on resource usage for now, no monitoring and so on). Before we start, let’s remove the sample thinjail1 we created in the last post:

root@freebsd14:~ $ jls
   JID  IP Address      Hostname                      Path
     1  192.168.122.160 thinjail1                     /usr/local/jails/containers/thinjail1
root@freebsd14:~ $ service jail stop thinjail1
Stopping jails: thinjail1.
root@freebsd14:~ $ chflags -R 0 /usr/local/jails/containers/thinjail1
root@freebsd14:~ $ rm -rf /usr/local/jails/containers/thinjail1/*
root@freebsd14:~ $ jls
   JID  IP Address      Hostname                      Path
root@freebsd14:~ $ zfs destroy zroot/jails/containers/thinjail1

We’ll also remove the ZFS snapshot because we’ll be creating a new one containing additional packages and configuration:

root@freebsd14:~ $ zfs destroy zroot/jails/templates/14.1-RELEASE@base
root@freebsd14:~ $ zfs list -t snapshot
NAME                                       USED  AVAIL  REFER  MOUNTPOINT
zroot/ROOT/default@2024-11-26-10:56:43-0   394M      -  1.46G  -
zroot/ROOT/default@2024-12-04-08:41:41-0  2.23M      -  2.49G  -
zroot/ROOT/default@2024-12-04-08:42:46-0  1.86M      -  2.58G  -

Remember that this was our base for the Jail:

root@freebsd14:~ $ ls -al /usr/local/jails/templates/14.1-RELEASE/
total 95
drwxr-xr-x  18 root wheel   22 Dec  2 13:35 .
drwxr-xr-x   3 root wheel    3 Dec  2 13:29 ..
-rw-r--r--   2 root wheel 1011 May 31  2024 .cshrc
-rw-r--r--   2 root wheel  495 May 31  2024 .profile
-r--r--r--   1 root wheel 6109 May 31  2024 COPYRIGHT
drwxr-xr-x   2 root wheel   49 Dec  2 13:40 bin
drwxr-xr-x  15 root wheel   69 Dec  2 13:40 boot
dr-xr-xr-x   2 root wheel    2 May 31  2024 dev
drwxr-xr-x  30 root wheel  105 Dec  2 13:40 etc
drwxr-xr-x   4 root wheel   78 Dec  2 13:40 lib
drwxr-xr-x   3 root wheel    5 May 31  2024 libexec
drwxr-xr-x   2 root wheel    2 May 31  2024 media
drwxr-xr-x   2 root wheel    2 May 31  2024 mnt
drwxr-xr-x   2 root wheel    2 May 31  2024 net
dr-xr-xr-x   2 root wheel    2 May 31  2024 proc
drwxr-xr-x   2 root wheel  150 Dec  2 13:40 rescue
drwxr-x---   2 root wheel    7 May 31  2024 root
drwxr-xr-x   2 root wheel  150 Dec  2 13:40 sbin
lrwxr-xr-x   1 root wheel   11 May 31  2024 sys -> usr/src/sys
drwxrwxrwt   2 root wheel    2 May 31  2024 tmp
drwxr-xr-x  14 root wheel   14 May 31  2024 usr
drwxr-xr-x  24 root wheel   24 May 31  2024 var

What we’re doing now is installing PostgreSQL into this base, so that it is available immediately when we create new Jails afterwards. To do this we can ask pkg to perform its actions against a chroot environment instead of the host system by using the “-c” switch:

root@freebsd14:~ $ pkg -c /usr/local/jails/templates/14.1-RELEASE/ install postgresql17-server-17.2 postgresql17-contrib-17.2
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
The following 17 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
        gettext-runtime: 0.22.5
        icu: 74.2_1,1
        indexinfo: 0.3.1
        libedit: 3.1.20240808,1
        libffi: 3.4.6
        liblz4: 1.10.0,1
        libxml2: 2.11.9
        llvm15: 15.0.7_10
        lua53: 5.3.6_1
        mpdecimal: 4.0.0
        perl5: 5.36.3_2
        postgresql17-client: 17.2
        postgresql17-contrib: 17.2
        postgresql17-server: 17.2
        python311: 3.11.10
        readline: 8.2.13_1
        zstd: 1.5.6

Number of packages to be installed: 17

The process will require 1 GiB more space.
231 MiB to be downloaded.

Proceed with this action? [y/N]: y
[1/17] Fetching indexinfo-0.3.1.pkg: 100%    6 KiB   5.9kB/s    00:01    
[2/17] Fetching libxml2-2.11.9.pkg: 100%  873 KiB 893.6kB/s    00:01    
...
To use PostgreSQL, enable it in rc.conf using

  sysrc postgresql_enable=yes

To initialize the database, run

  service postgresql initdb

You can then start PostgreSQL by running:

  service postgresql start

For postmaster settings, see ~postgres/data/postgresql.conf

NB. FreeBSD's PostgreSQL port logs to syslog by default
    See ~postgres/data/postgresql.conf for more info

NB. If you're not using a checksumming filesystem like ZFS, you might
    wish to enable data checksumming. It can be enabled during
    the initdb phase, by adding the "--data-checksums" flag to
    the postgresql_initdb_flags rcvar. Otherwise you can enable it later by
    using pg_checksums.  Check the initdb(1) manpage for more info
    and make sure you understand the performance implications.

We can easily verify this by looking at /usr/local/bin of our base:

root@freebsd14:~ $ ls -al /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pg*
-rwxr-xr-x  1 root wheel  99224 Dec  3 02:06 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pg_amcheck
-rwxr-xr-x  1 root wheel  43616 Dec  3 02:18 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pg_archivecleanup
-rwxr-xr-x  1 root wheel 161128 Dec  3 02:18 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pg_basebackup
-rwxr-xr-x  1 root wheel  78336 Dec  3 02:18 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pg_checksums
-rwxr-xr-x  1 root wheel  53784 Dec  3 02:06 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pg_config
-rwxr-xr-x  1 root wheel  57272 Dec  3 02:18 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pg_controldata
-rwxr-xr-x  1 root wheel 101656 Dec  3 02:18 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pg_createsubscriber
-rwxr-xr-x  1 root wheel  72136 Dec  3 02:18 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pg_ctl
-rwxr-xr-x  1 root wheel 422632 Dec  3 02:06 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pg_dump
-rwxr-xr-x  1 root wheel 121896 Dec  3 02:06 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pg_dumpall
-rwxr-xr-x  1 root wheel  67904 Dec  3 02:06 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pg_isready
-rwxr-xr-x  1 root wheel 103840 Dec  3 02:18 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pg_receivewal
-rwxr-xr-x  1 root wheel 101768 Dec  3 02:18 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pg_recvlogical
-rwxr-xr-x  1 root wheel  68312 Dec  3 02:18 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pg_resetwal
-rwxr-xr-x  1 root wheel 207752 Dec  3 02:06 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pg_restore
-rwxr-xr-x  1 root wheel 153560 Dec  3 02:18 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pg_rewind
-rwxr-xr-x  1 root wheel  49024 Dec  3 02:18 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pg_test_fsync
-rwxr-xr-x  1 root wheel  39088 Dec  3 02:18 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pg_test_timing
-rwxr-xr-x  1 root wheel 178256 Dec  3 02:18 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pg_upgrade
-rwxr-xr-x  1 root wheel 110984 Dec  3 02:18 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pg_waldump
-rwxr-xr-x  1 root wheel 207416 Dec  3 02:06 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pgbench
root@freebsd14:~ $ ls -al /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/po*
-rwxr-xr-x  1 root wheel    4149 Oct 31 02:09 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pod2html
-rwxr-xr-x  1 root wheel   15046 Oct 31 02:09 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pod2man
-rwxr-xr-x  1 root wheel   10815 Oct 31 02:09 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pod2text
-rwxr-xr-x  1 root wheel    4113 Oct 31 02:09 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/pod2usage
-rwxr-xr-x  1 root wheel    3664 Oct 31 02:09 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/podchecker
-rwxr-xr-x  1 root wheel 9548040 Dec  3 02:18 /usr/local/jails/templates/14.1-RELEASE/usr/local/bin/postgres

In the same way as we can ask pkg to operate in a chroot, we can also ask sysrc to do so and directly enable the PostgreSQL service:

root@freebsd14:~ $ sysrc -R /usr/local/jails/templates/14.1-RELEASE/ postgresql_enable=yes
postgresql_enable:  -> yes
root@freebsd14:~ $ cat /usr/local/jails/templates/14.1-RELEASE/etc/rc.conf
postgresql_enable="yes"

Now we have everything we need and can create a new ZFS snapshot:

root@freebsd14:~ $ zfs snapshot zroot/jails/templates/14.1-RELEASE@postgresql17
root@freebsd14:~ $ zfs list -t snapshot
NAME                                              USED  AVAIL  REFER  MOUNTPOINT
zroot/ROOT/default@2024-11-26-10:56:43-0          394M      -  1.46G  -
zroot/ROOT/default@2024-12-04-08:41:41-0         2.23M      -  2.49G  -
zroot/ROOT/default@2024-12-04-08:42:46-0         1.86M      -  2.58G  -
zroot/jails/templates/14.1-RELEASE@postgresql17     0B      -  1.34G  -

The configuration for the Jail from the last post looked like this:

root@freebsd14:~ $ cat /etc/jail.conf.d/thinjail1.conf 
thinjail1 {
  # STARTUP/LOGGING
  exec.start = "/bin/sh /etc/rc";
  exec.stop = "/bin/sh /etc/rc.shutdown";
  exec.consolelog = "/var/log/jail_console_${name}.log";

  # PERMISSIONS
  allow.raw_sockets;
  exec.clean;
  mount.devfs;

  # HOSTNAME/PATH
  host.hostname = "${name}";
  path = "/usr/local/jails/containers/${name}";

  # NETWORK
  ip4.addr = 192.168.122.160;
  interface = vtnet0;
}

The dynamic parts in that configuration are the IP address, the path, the hostname and the name of the Jail. Everything else remains static for additional Jails. So we could create a simple script which creates a new Jail configuration based on a few parameters (there is no error handling, no sanity checks, nor anything else that makes a good script a good script; this really is just for demonstration purposes):

#!/usr/local/bin/bash

# The parameters for a new Jail
JAILNAME="$1"
IPADDR="$2"

# This is where all the Jails go to
BASEPATH="/usr/local/jails/containers/"

cat << EOF > /etc/jail.conf.d/${JAILNAME}.conf
${JAILNAME} {
  # STARTUP/LOGGING
  exec.start = "/bin/sh /etc/rc";
  exec.stop = "/bin/sh /etc/rc.shutdown";
  exec.consolelog = "/var/log/jail_console_${JAILNAME}.log";
  # PERMISSIONS
  allow.raw_sockets;
  allow.sysvipc = 1;
  exec.clean;
  mount.devfs;
  # HOSTNAME/PATH
  host.hostname = "${JAILNAME}.it.dbi-services.com";
  path = "${BASEPATH}/${JAILNAME}";
  # NETWORK
  ip4.addr = ${IPADDR};
  interface = vtnet0;
}
EOF

zfs clone zroot/jails/templates/14.1-RELEASE@postgresql17 zroot/jails/containers/${JAILNAME}
service jail start ${JAILNAME}
jls
jexec -U postgres ${JAILNAME} /usr/local/bin/initdb --pgdata=/var/db/postgres/data17
jexec -l ${JAILNAME} service postgresql start

What this script is doing:

  • Create the Jail configuration based on the name and IP address
  • Clone the ZFS snapshot we’ve created above
  • Start the Jail
  • Initialize PostgreSQL
  • Start up PostgreSQL

If you run that it looks like this:

root@freebsd14:~ $ ./new_jail.sh PG1 192.168.122.130
Starting jails: PG1.
   JID  IP Address      Hostname                      Path
    11  192.168.122.130 PG1.it.dbi-services.com       /usr/local/jails/containers/PG1
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "C.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory /var/db/postgres/data17 ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default "max_connections" ... 100
selecting default "shared_buffers" ... 128MB
selecting default time zone ... Europe/Vaduz
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
initdb: hint: You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    /usr/local/bin/pg_ctl -D /var/db/postgres/data17 -l logfile start

2024-12-04 10:28:32.103 CET [4809] LOG:  ending log output to stderr
2024-12-04 10:28:32.103 CET [4809] HINT:  Future log output will go to log destination "syslog".

That’s it, PostgreSQL is up and running in the Jail:

root@freebsd14:~ $ psql -h 192.168.122.130
psql: error: connection to server at "192.168.122.130", port 5432 failed: FATAL:  no pg_hba.conf entry for host "192.168.122.130", user "root", database "root", no encryption
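
The “no pg_hba.conf entry” error actually proves that the server answers on the Jail’s address; only client authentication is missing. A minimal sketch for letting the host connect (trust authentication is used purely for this demo, paths and network are the ones from above):

root@freebsd14:~ $ echo "host all all 192.168.122.0/24 trust" >> /usr/local/jails/containers/PG1/var/db/postgres/data17/pg_hba.conf
root@freebsd14:~ $ jexec -l PG1 service postgresql restart
root@freebsd14:~ $ psql -h 192.168.122.130 -U postgres -c "select version()"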

Pretty easy to do, and additional PostgreSQL Jails can now be added the same way. What is really impressive is the size of the Jail on disk:

root@freebsd14:~ $ du -sh /usr/local/jails/containers/PG1
1.4G    /usr/local/jails/containers/PG1

A bit more than a GB for a complete PostgreSQL service.
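
Note that du reports the logical size of the files. As the Jail is a ZFS clone of the template snapshot, the space it actually consumes on disk is only what differs from the template, which you can verify with zfs list:

root@freebsd14:~ $ zfs list -o name,used,refer zroot/jails/containers/PG1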

The article FreeBSD basics – 5 – Providing services using Jails appeared first on the dbi Blog.
