Greenplum LDAP Guide
This article includes the following sections:
- An Introduction to LDAP
- A note on using Active Directory
- Basic LDAP Authentication
- Authenticating Against an Indirect LDAP Attribute
- Synchronizing LDAP with the Database
LDAP is a protocol for centrally managing users, groups, and their access permissions. Applications or services across an organization can authenticate against a single company-wide LDAP server, instead of each implementing their own user management and access control systems.
In a typical Greenplum or PostgreSQL installation users and groups are manually created as roles within the database. Those roles can login or authenticate using a variety of authentication methods, but most are typically configured to use encrypted passwords. Administrators grant those users privileges to use and create database objects.
This system is simple, secure, and effective for most small database use cases. However, for use cases with many users or larger organizations, authenticating against LDAP can offer many benefits, such as:
- Centralized user management
- DBAs not responsible for creating/modifying database users and permissions
- Single password for employees across all applications and services
- Enforcement of corporate password policies, such as minimum password length and password rotation
- Logging of authentication attempts in a central location
- Easier security and compliance auditing
The below diagram illustrates the steps involved in authenticating with Greenplum or PostgreSQL through LDAP. Note that the user interacts only with the database server and all communication with LDAP is handled transparently.
The nomenclature used in LDAP environments is a bit different than database administrators may be used to. The following is a list of terms DBAs needs to know to successfully implement LDAP authentication:
Bind - A bind is an authentication request against the LDAP server. An application can send a bind request containing a user’s name and password, then the LDAP server will try to authenticate that user and respond with either success or failure.
DN - A Distinguished Name is a string that uniquely identifies a user. DNs are not equivalent to user names in the database since they contain extra information used by the LDAP server. For example, the database user ‘john’ might be identified by the LDAP DN ‘uid=john,ou=People,dc=example,dc=com’
Microsoft’s Active Directory makes use of the LDAP protocol, and as such Greenplum can be configured to authenticate against Active Directory in the same way it can be for LDAP. All of the examples below should work for Active Directory just as they would for a stand alone LDAP server.
Before you begin setting up LDAP authentication you need the following information:
- Format of DNs since it will vary from company to company
- The URL of the LDAP server and corresponding firewall permissions to access it
- The port of the LDAP service – 389 by default
- An authorized LDAP DN to begin testing with
- OpenLDAP’s client tools installed on the master server for testing and debugging
OpenLDAP is an open source project that implements LDAP. Their software includes a stand-alone LDAP server as well as a set of client tools for working with LDAP servers. These client tools are extremely useful when configuring LDAP authentication with Greenplum or PostgreSQL. These tools can be downloaded here and should be installed on your cluster’s master server to aid in testing and debugging of the LDAP connection.
First, make sure networking between the LDAP server and the Greenplum master is configured correctly by using the netcat utility from the master. Telnet could also be used if netcat is not available. If this command fails then you should troubleshoot your network connection.
$ nc -v ldap_hostname 389 Connection to ldap_hostname 389 port [tcp/ldap] succeeded!
Next, use the ‘ldapsearch’ utility (provided by OpenLDAP) to test binding to the LDAP server. This will ensure the format of your DN is correct and your permissions have been configured correctly on the LDAP server.
$ ldapsearch -W -h ldap_hostname -D "uid=john,ou=People,dc=example,dc=com" -b "dc=example,dc=com" cn Enter LDAP Password:
If the bind was successful you should see output similar to the following. Pay special attention to the ‘search result’ section, which should show ‘0 Success’ as the result:
# extended LDIF # # LDAPv3 # base <uid=john,ou=People,dc=example,dc=com> with scope subtree # filter: (objectclass=*) # requesting: cn # # john, People, example.com dn: uid=john,ou=People,dc=example,dc=com cn: John Doe # search result search: 2 result: 0 Success # numResponses: 2 # numEntries: 1
Now that we have verified that master has connectivity to the LDAP server and our user has permission to bind, we can configure Greenplum to use LDAP as the authentication method.
For basic authentication, the database username must be part of the LDAP DN. If your database usernames are not part of the DN, see the below section that describes how to authenticate against a different LDAP attribute. In the example above our dn is “uid=john,ou=People,dc=example,dc=com” so we must make sure the username “john” exists in the database.
demo=# create user john; CREATE ROLE
Once we are sure the user exists we can modify the pg_hba.conf configuration file to set the authentication method for our user. Since the DN is different from the database username, and the structure of DNs can vary drastically between organizations, we must instruct Greenplum on how to construct the DN based on the username. We do this by specifying the “ldapprefix” and “ldapsuffix” parameters in our configuration. Greenplum will concatenate the ldapprefix, database username, and ldapsuffix to create the DN it sends to the LDAP server. For our example the pg_hba.conf entry should look as follows:
host all john 0.0.0.0/0 ldap ldapserver=ldap_hostname ldapprefix="uid=" ldapsuffix=",ou=People,dc=example,dc=com"
At this point LDAP authentication is configured and you should be able to login to the Greenplum or PostgreSQL server as normal.
$ psql -h 127.0.0.1 -U john Password for user john: psql (8.2.15) Type "help" for help. demo=>
In the basic LDAP authentication scenario described above it was required that the database username be part of the LDAP DN. However, in some cases it is desirable for the database username to be an LDAP attribute that is not part of the DN.
For example, it might be the case in an organization that the DN contains an employee number such as: “cn=801223,dc=example,dc=com”. We could use basic authentication if all database usernames where equal to employee numbers, but this generally makes DBA tasks more difficult and decreases overall database usability. If our LDAP record had another attribute such as ‘username’ it would be much better to use that as the database login instead.
To make this work we need the following additional information:
- Name of the LDAP attribute we wish to use as the database username
- A dedicated DN and password that we can use for searching the LDAP directory
Like with basic authentication, we should first use OpenLDAP’s ldapsearch utility to verify our information is correct. We do need to modify the ldapsearch command slightly so that we bind with the utility DN and perform a search on the desired attribute. The below command uses the “Manager” LDAP account to return the full DN for the LDAP record with a uid of “john”.
$ ldapsearch -W -D "cn=Manager,dc=example,dc=com" -b "dc=example,dc=com" "uid=john"
Once we verify that is working as intended we can modify our pg_hba.conf entry to the following:
host all john 0.0.0.0/0 ldap ldapserver=ldap_host ldapbasedn="dc=example,dc=com" ldapbinddn="cn=Manager,dc=example,dc=com" ldapbindpasswd="changeme" ldapsearchattribute="uid"
Now when the user john attempts to login, Greenplum will first perform a search to find that user’s full DN before trying to bind as that user.
One major downside to LDAP authentication in the database is that the user accounts must be manually created in the database before a user can login. They must also be manually maintained and kept in sync with the LDAP server. If a new account is added to LDAP then a DBA must be responsible for also adding it to the database before they can login. We also can’t take advantage of LDAP group membership without a DBA being aware of the group hierarchies and keeping them in sync.
These problem can be solved by setting up an automated process that synchronizes Greenplum and the LDAP server. Many DBAs have historically doen this by writing custom scripts on top of the OpenLDAP search tools described above and scheduling them to run through Cron, but it can also be achieved very simply by using the excellent open source pg-ldap-sync tool.
Installation is very straight forward and only requires that Ruby and Rubygems be installed on the master server. For complete installation instructions see the pg-ldap-sync Github project page. The instructions say either the ‘pg’ or ‘postgres-pr’ gems can be installed, but be aware that installing ‘postgres-pr’ will be easier since the ‘pg’ gem must be compiled and links against PostgreSQL header files.
$ gem install pg-ldap-sync postgres-pr
Once installation is complete you need only create a simple configuration file that tells pg-ldap-sync how to connect to the LDAP and Greenplum servers and which records need to be synchronized. Here is a simple configuration file based on our examples above:
ldap_connection: host: localhost auth: method: :simple username: cn=Manager,dc=example,dc=com password: changeme ldap_users: base: ou=People,dc=example,dc=com filter: (&(objectClass=inetOrgPerson)(uid=*)) name_attribute: uid ldap_groups: base: ou=Group,dc=example,dc=com filter: (|(cn=dba)(cn=reporting)) name_attribute: cn member_attribute: member pg_connection: hostaddr: localhost dbname: postgres user: gpadmin password: pg_users: filter: rolcanlogin AND NOT rolsuper create_options: LOGIN pg_groups: filter: NOT rolcanlogin AND NOT rolsuper create_options: NOLOGIN grant_options:
As an example let us assume that our LDAP directory contains two groups named “dba” and “reporting” with one user named “john” who is a member of both groups. We can run the pg_ldap_sync utility with the -t and -vv flags to run it in testing mode and see the operations it would run:
$ pg_ldap_sync -c sync.yaml -t -vv I, INFO -- : found user-dn: uid=john,ou=People,dc=example,dc=com I, INFO -- : found group-dn: cn=dba,ou=Group,dc=example,dc=com I, INFO -- : found group-dn: cn=reporting,ou=Group,dc=example,dc=com I, INFO -- : found pg-user: "john" I, INFO -- : user stat: create: 0 drop: 0 keep: 1 I, INFO -- : group stat: create: 2 drop: 0 keep: 0 I, INFO -- : membership stat: grant: 2 revoke: 0 keep: 0 I, INFO -- : SQL: CREATE ROLE "dba" NOLOGIN I, INFO -- : SQL: CREATE ROLE "reporting" NOLOGIN I, INFO -- : SQL: GRANT "reporting" TO "john" I, INFO -- : SQL: GRANT "dba" TO "john"
We can see that the user “john” already existed in the database so it decided to keep it. It also determined both groups needed to be created and “john” needed to be added to them.
Now that we see it is working as expected we can add this job to Cron so it will automatically be run nightly (or even more often) to keep Greenplum in sync with the LDAP server. In this way users and their group memberships will be kept in sync without intervention from a DBA. Remember that authentication must still be configured using the basic method or the more advcanced method described above.
LDAP is a powerful way to centrally manage users and groups in an organization. Greenplum or PostgreSQL can be easily configured to take advantage of LDAP or Active Directory as an authentication source. Combined with an automated synchronization process, such as pg-ldap-sync, most user, group, and permission administration tasks no longer need to be handled by a DBA.