Dillon Woods
Founder @
Tackle

Greenplum Kerberos Guide

May, 2013

This guide was developed in collaboration with Zeb Bowden and the rest of the team at the Virginia Tech Transportation Institute.

This article includes the following sections:

Conceptual Overview

Our primary goal when configuring Kerberos authentication is to enable single sign-on in the environment. That is, users should have to authenticate by providing a password once when they login to their workstation and should thereafter be able to access any system or service in the organization without ever re-authenticating.

Throughout this guide we will be referring to the scenario where a user logs into their Windows workstation via Active Directory and then accesses a web service that pulls data from a Greenplum database. After logging into Windows, the user’s credentials should be passed through the web service to Greenplum without requiring the user to authenticate again. The following diagram illustrates the process from the user’s perspective:

Kerberos Authentication Figure 1

Although this appears very simple to the user, the process behind the scenes is quite a bit more complicated. The first step is for the user to login to their Windows workstation by authenticating with Active Directory. After the user enters their password the system will authenticate them with AD, and if successful they will be given a Kerberos “ticket” that is stored on the client system and proves that their authentication was successful.

Kerberos Authentication Figure 2

Before the user can access Greenplum they must get a second Kerberos ticket that allows them to authenticate with the database. This ticket is obtained automatically as soon they try to access the database, whether it be through a desktop application or a web service. The ticket they already received is used for authentication, so the user is not required to type in their password again.

Kerberos Authentication Figure 3

Once the user receives their Greenplum service ticket they can proceed by passing that ticket to the database. Greenplum will use the ticket to securely prove the user’s identity and allow them to login, but permissions to database objects will still be managed internally by the database. Note that the user can access the database directly using a client like PGAdmin, or an application or web service can use Kerberos delegation to pass the user’s ticket through to the database automatically as shown below:

Kerberos Authentication Figure 4

Kerberos Introduction

Kerberos is a single sign-on protocol developed by researchers at MIT. It specifies how users or services can authenticate with a central service, and once authenticated can pass secure “tickets” amongst themselves to prove their respective identities. As described above, this protocol is commonly used in a Greenplum or PostgreSQL environment to allow users to automatically authenticate with the database without requiring them to provide any credentials such as a password.

Kerberos Implementations

There are two popular implementations of the Kerberos protocol widely used today. The first is MIT’s open source Kerberos implementation. The second is Microsoft Kerberos, which is provided by Active Directory. These systems are similar, since they both implement the same underlying protocol, but there are difference that could affect configuration. The examples in this guide assume Microsoft Kerberos is used as the KDC.

Glossary

The following terms are unique to the Kerberos protocol and should be understood before beginning configuration.

Ticket - A ticket is a secure, encrypted credential issued by a KDC or TGS that proves the identity of a user or service

Authentication Server (AS) - An Authentication Server is responsible for authenticating users via a password. If authentication is successful, the AS forwards requests to the KDC

Key Distribution Center (KDC) - A Key Distribution Center is a service responsible for issuing Ticket Granting Tickets

Ticket Granting Ticket (TGT) - A Ticket Granting Ticket is a special type of Ticket that proves a user has been authenticated with an Authentication Server

Ticket Granting Service (TGS) - A Ticket Granting Service receives user requests to access services, such as Greenplum. It first verifies users are authenticated via the user’s TGT, then grants them a Ticket for the requested service if the user is permitted to access it

Service Server (SS) - A Service Server is a server that provides a particular service. For example, the Greenplum master host is the SS for the Greenplum service

Principal - A Principal is a unique identifier for a given user or service. is typically takes the form of primary/instance@realm, for example dwoods/[email protected]

Service Principal Name (SPN) - A Service Principal Name maps a Kerberos principal to a service. For example, the gpadmin principal might be mapped to the postgres SPN

Certificate Cache (CC) - A local file store for Kerberos tickets that have been issues to a particular user. Typically a file in /tmp in Linux or an in-memory store on Windows

GSS-API

GSSAPI stands for Generic Security Services Application Program Interface and is a high level interface, or API, for accessing security services such as Kerberos. As discussed previously, all Kerberos implementations are all slightly different since the Kerberos API has not been standardized. By supporting and exposing GSSAPI, Kerberos implementations give clients and services a standard way to connect to and use Kerberos. GSSAPI should always be the preferred way to use Kerberos when available. Unfortunately GSSAPI is not supported by Greenplum at this time, so the Kerberos API must be used instead.

Greenplum Kerberos Configuration

This section will walk you through the process of configuring Kerberos authentication in Greenplum. To complete these steps you will need administrator level access to Active Directory and Greenplum. It is also recommended you have a third machine for testing client Kerberos connections. This guide was tested with Greenplum, but should work equally well with PostgreSQL.

Windows Configuration

First, in the Active Directory Management Console add a new Managed Service Account named ‘gpadmin’. This account will be used by the Greenplum database to perform authentication.

Add Managed Service Account

In either Windows PowerShell or cmd.exe, use the setspn utility to associate a service called postgres with the Managed Service Account just created. Setting the GPMasterHostname correctly is very important. Be sure it is on the same network as the Active Directory server and that the hostname can be fully resolved. Test this using nslookup if you are unsure.

$ setspn -A postgres/<GPMasterHostname> gpadmin

Once the Service Principal Name has been associated with the gpadmin account you can open the user properties in the Active Directory Management Console to verify the User Login is set correctly and that the password for the gpadmin account never expires.

GPAdmin Active Directory Properties

Next, export a key file that the Greenplum service can use to authenticate with Kerberos using the ktpass utility from Windows PowerShell. This will allow Greenplum to acquire a Ticket from Kerberos without a password when the database is started.

$ ktpass -out greenplum.keytab -princ postgres/<masterHost>@<domain> -mapUser gpadmin -pass Greenplum1 -crypto DES-CBC-MD5

Note that the encryption type specified with the -crypto parameter may vary depending on your version and site specific configuration of Active Directory.

Put the generated key file on the GP master where it can be accessed by the gpadmin user. $MASTER_DATA_DIREFounderRY/kerberos.keytab may be a good choice, but be sure it is owned by gpadmin, that it also located in the same place on your standby master, and that it is backed up properly.

NTP Synchronization

It is important that system times on the Active Directory and Greenplum servers not be more than 5 minutes apart. It is suggested that you use Network Time Protocol (NTP) to keep the server times in sync. On the Greenplum master you can configure this through /etc/ntp.conf, but keep in mind that the master system clock must also be synced with all of the segment hosts.

Greenplum Server Configuration

The following steps should all be done on the Greenplum Master server as well as the Standby Master. All modified configuration files should be backed up regularly.

Make the following changes to Greenplum’s global configuration file, which can be found at $MASTER_DATA_DIREFounderRY/postgresql.conf

krb_server_keyfile = '<location of key file>'
krb_srvname = 'postgres'

Add the following variables to GPAdmin’s environment by modifying /home/gpadmin/.bashrc

export KRB5_KTNAME=<location of key file>
export KRB5_TRACE=/tmp/krb5_gp.log

Note that the second environment variable, KRB5_TRACE, is used only for debugging purposes and can be removed at the administrator’s discretion.

Make the following changes to the system Kerberos configuration file located at /etc/krb5.conf

[libdefaults]
 debug=true
 default_etypes = rc4-hmac
 default_realm = <Realm>
 allow_weak_crypto = 1
 dns_lookup_realm = false
 dns_lookup_kdc = false
 forwardable = yes

[realms]
 <Realm> = {
  kdc = <KDCHostname>:88
  admin_server = <KDCHostname>:749
 }

[domain_realm]
 <KDCHostname> = <Realm>
 .<KDCHostname> = <Realm>
 .<otherDomain> = <Realm>

The following values are examples of what might go in the above configuration:

  • = EC2-107-22-98-252.COMPUTE-1.AMAZONAWS.COM
  • = ec2-107-22-98-252.compute-1.amazonaws.com
  • = .ec2.internal

Make sure that any users who should authenticate through Kerberos exist in the database. Also be sure each user exists in Active Directory. See the Greenplum LDAP Guide for a solution to synchronize Active Directory users with the database.

demo=# create user testuser;
demo=# select * from pg_user where usename='testuser';
usename  | usesysid | usecreatedb | usesuper | usecatupd |  passwd  | valuntil | useconfig 
----------+----------+-------------+----------+-----------+----------+----------+-----------
 testuser |    17774 | f           | f        | f         | ******** |          | 
(1 row)

All authentication in Greenplum is configured through a file called pg_hba.conf located in Greenplum’s $MASTER_DATA_DIREFounderRY. Once your users exist in the database you must also add them to this file to enforce Kerberos authentication. The following is an example of what your pg_hba.conf entry should look like:

    host    all    testuser,myuser    0.0.0.0/0    krb5    include_realm=0

User names can be separated by commas if you have more than one to add, or you can specify that an entire group should authenticate with Kerberos by using the @groupname syntax:

    host    all    @kerberos_group    0.0.0.0/0    krb5    include_realm=0

Also be sure to specify a valid IP address range for connection, 0.0.0.0/0 should be used for testing purposes only.

After all of the above Greenplum configuration changes are complete you should conclude by restarting the database using gpstop and gpstart.

Testing

Once the database restart is complete you can begin testing your configuration. Use the klist utility on the master to verify gpadmin has authenticated with Kerbers and has acquired a Ticket Granting Ticket, krbtgt, and a service ticket for postgres.

$ klist
    Ticket cache: FILE:/tmp/krb5cc_500
    Default principal: [email protected]

    Valid starting     Expires            Service principal
    05/17/13 11:38:50  05/17/13 21:39:02  krbtgt/EC2-107-22-98-252.COMPUTE-1.AMAZONAWS.COM@EC2-107-22-98-252.COMPUTE-1.AMAZONAWS.COM
        renew until 05/18/13 11:38:50
    05/17/13 11:39:05  05/17/13 21:39:02  postgres/ip-10-140-0-121.ec2.internal@EC2-107-22-98-252.COMPUTE-1.AMAZONAWS.COM
        renew until 05/18/13 11:38:50

Next you should attempt to make a connection to the database from a different machine. Be sure the machine you use for testing has connectivity to both Active Directory and Greenplum, has Kerberos utilities installed, and has the same krb5.conf configuration as the Greenplum master.

Start by authenticating the database user you want to test with Kerberos.

$ kdestroy 
$ klist
    klist: No credentials cache found (ticket cache FILE:/tmp/krb5cc_0)
$ kinit testuser
    Password for [email protected]: 
$ klist
    Ticket cache: FILE:/tmp/krb5cc_0
    Default principal: [email protected]

    Valid starting     Expires            Service principal
    05/20/13 22:28:24  05/21/13 08:27:53  krbtgt/EC2-107-22-98-252.COMPUTE-1.AMAZONAWS.COM@EC2-107-22-98-252.COMPUTE-1.AMAZONAWS.COM
        renew until 05/21/13 22:28:24

Once the user is authenticated with Kerberos and has been granted a Ticket Granting Ticket, krbtgt, you can test using the same user to connect to Greenplum.

$ psql -U testuser -h 10.140.0.121 -d demo
Type "help" for help.

demo=>

Note that you should not be prompted for a password when connecting. After connecting successfully you can further verify authentication is working properly by checking that the user was granted an additional ticket for the postgres service:

$ klist
    Ticket cache: FILE:/tmp/krb5cc_0
    Default principal: [email protected]

    Valid starting     Expires            Service principal
    05/20/13 22:28:24  05/21/13 08:27:53  krbtgt/EC2-107-22-98-252.COMPUTE-1.AMAZONAWS.COM@EC2-107-22-98-252.COMPUTE-1.AMAZONAWS.COM
        renew until 05/21/13 22:28:24
    05/20/13 22:28:07  05/21/13 08:27:53  postgres/ip-10-140-0-121.ec2.internal@EC2-107-22-98-252.COMPUTE-1.AMAZONAWS.COM
        renew until 05/21/13 22:28:24

ODBC Configuration

At this point we have configured Greenplum and Active Directory for Kerberos authentication and have proven clients can successfully connect using the native psql utility. The next step is to allow users to connect through tools or applications by configuring ODBC on the client machines.

The ODBC driver you choose to use must support the krb5 protocol. In the following examples we will use the Greenplum ODBC driver provided by the greenplum-connectivity package, but the open source psqlODBC driver should work as well. Note that the Greenplum ODBC driver provided by DataDirect does not support Kerberos authentication.

Linux

First, download the latest greenplum-connectivity package and install it on the client system using the included documentation. Once installed you should create an odbc.ini file and an entry for your Greenplum cluster. The following is an example of what that entry should look like:

    [GP]
    Description         = Test to Greenplum
    Driver              = /usr/local/greenplum-connectivity/drivers/odbc/psqlodbc-09.00.0200/unixodbc-2.2.12/psqlodbcw.so
    Trace               = 1
    Debug               = 1
    TraceFile           = /home/gpadmin/sql.log
    Database            = demo
    Servername          = 10.140.0.121
    UserName            =
    Password            =
    Port                = 5432
    Protocol            = 7.4-1
    ReadOnly            = No
    RowVersioning       = No
    ShowSystemTables    = Yes
    ShowOidColumn       = No
    FakeOidIndex        = No
    useDeclareFetch     = 1
    Fetch               = 4096
    ConnSettings        =

Make sure your $ODBCINI environment variable points to the odbc.ini file you just created:

$ export ODBCINI=/home/gpadmin/odbc.ini

Also be sure to source greenplum_connectivity_path.sh. For continued use this line should be added to the user’s .bashrc:

    $ source /usr/local/greenplum-connectivity/greenplum_connectivity_path.sh

Authenticate with Kerberos and verify you received a Ticket Granting Ticket as described in the Testing section above. Finally, test that the ODBC connection is working by using the isql utility.

$ isql -3 GP testuser
    +---------------------------------------+
    | Connected!                            |
    |                                       |
    | sql-statement                         |
    | help [tablename]                      |
    | quit                                  |
    |                                       |
    +---------------------------------------+
    SQL>

Note that we passed the parameter -3 to tell isql to use ODBC 3 calls and the GP parameter so isql would use that entry from our odbc.ini.

You are now ready to connect any Linux tools or applications to Greenplum using this ODBC driver. All database connections will use Kerberos and authenticate using the local user’s TGT.

Windows

Similarly, you can configure an ODBC driver in any Windows environment to allow Windows based tools or applications to connect to the database. As with the Linux configuration, first download and install the latest greenplum-connectivity package for Windows. After it is installed you must configure a DSN for Greenplum using Microsoft’s ODBC Data Source Administrator utility. Note that the Greenplum Windows ODBC driver is 32 bit, so the 32 bit version of Data Source Administrator must be used. On a 64 bit Windows system the 64 bit Data Source Administrator will be opened by default and you will not be able to configure the Greenplum driver. The 32 bit version can be accessed by running the following command:

c:\Windows\SysWOW64\odbcad32.exe

Choose the “PostgreSQL Unicode” driver when adding the DSN and then proceed with configuring it with your cluster details. Be sure the “SSL Mode” option is set to “allow”.

Data Source Administrator

Add a new Windows Registry Key to tell the ODBC driver how to access the Certificate Cache. The key should be located in \HKLM\SOFTWARE\Wow6432Node\MIT\Kerberos5. It should have the name ‘ccname’ and should be of type REG_SZ. The value should be ‘MSLSA:‘, which stands for the Microsoft Local Security Authority.

Kerberos Registry Key

Once configured you should be able to login to the Windows system using a database user account and press the “Test” button in the above Setup dialogue box to verify the connection is working.

Application Authentication Delegation

It is common for an internal application or web service to connect to the Greenplum database using an application or service account. For example, a BI tool like Cognos might handle end-user authentication itself and always connect to the database using a special ‘cognos’ user in Greenplum.

Kerberos Delegation Figure 1

Although this works in the majority of cases, it may sometimes be advantageous to pass the responsibility of user authentication through to the database. In this scenario the application or web service will receive the user’s TGT and pass it on to Greenplum instead of authenticating the user itself. This process is know as Kerberos Delegation.

Kerberos Delegation Figure 2

Since the database authenticates each user directly this method allows for much more fine grained access control of database objects. It also eliminates the need to create or maintain application accounts. The downside to this approach is that the application loses some control over the uesrs connecting to it. It can’t employ user specific logic or access control.

To use Kerberos Delegation you need only make sure the users and services involved are given delegation permission in Active Directory. This is typically the default setting for users and setspn should automatically enable delegation for Managed Service Accounts, but if you run into problems on this step you may need to explicitly enable delegation.

Testing

To test Kerberos Delegation, create and deploy a simple IIS web application that uses a system ODBC DSN to make a connection to Greenplum and displays the name of the connected user. When users access that web application from their desktops they should see their username displayed as the user making the connection. The currently connected user can be displayed with the following simple SQL statement after connecting:

SELECT current_user;

Zeb Bowden has created an excellent test application called KerbTest that is freely available on GitHub. As described above, it can be deployed to an IIS server and if Kerberos delegation has been configured properly it will display the name of the end user accessing the application. Complete instructions for use are included in the README.txt available at the GitHub repo.

KerbTest Screen Shot

Conclusion

Enabling Kerberos authentication in Greenplum is an easy way to provide database connectivity in your existing Single Sign-On infrastructure. It simplifies Greenplum configuration since passwords need not be managed by the database, and when combined with Active Directory Synchronization database users can be managed almost entirely without DBA intervention. Further, applications and services in your organizations can delegate their authentication to Greenplum to simplify application development.

comments powered by Disqus