Adding an Integration Services Catalog to Always On Availability Groups – Subquery Error

This month’s T-SQL Tuesday topic by Wayne Sheffield (t) is “Tell me about your Brick Wall and how you overcame it.”  I hadn’t originally planned this post for this reason, but it’s right on topic.

 

I recently discovered an issue while adding an Integration Services Catalog to a SQL Server 2016 Availability Group.  SQL Server 2016 introduced support for having a SSIS Catalog (SSISDB) in an Always On Availability Group.  This provides a high-availability and disaster-recovery solution for the packages, projects, executions logs, and environments found within this database.  After a failover occurs, and your packages are configured to use the listener, processes can continue (in the case a failover occurs and a package is running, they do NOT restart or resume).  See here for more details.  Great!

In most cases this works just as specified.  However, what if you have the following scenario?

InstanceAvailability GroupIncludes SSISDB?Availability Replicas
SQLSRV01AG1NoSQLSRV01, SQLSRV02,
SQLSRV03,
SQLSRV04
SQLSRV01AG2YesSQLSRV01,
SQLSRV02
SQLSRV03AG3YesSQLSRV03,
SQLSRV04

In the above example, we follow the instructions and add the SSIS Catalog to the AG2 availability group.  Once added, we complete Step 3 (Enable SSIS support for Always On) http://bit.ly/2OmQMvn.  Everything works as expected, and Auto-failovers can now occur.  Wait.  What does this step do any way?

  • Determines the secondary replica server names.  Notice, this is cluster wide, not specific to any one AG.  More on this later.

  • Creates the dbo.sp_ssis_startup stored procedure, within the master database, on each replica.
  • Set the dbo.sp_ssis_startup job to run whenever SQL Server starts.
  • Sets ‘clr enabled’ to 1 on all secondary replicas.
  • Adds replica information to SSISDB.internal.alwayson_support_state.
  • Creates two logins on all secondary replicas.
    • ##MS_SQLEnableSystemAssemblyLoadingUser##
      • Login is created using the MS_SQLEnableSystemAssemblyLoadingKey asymmetric key.  This key created from the Microsoft.SqlServer.IntegrationServices.Server.dll file.
      • Login is granted the UNSAFE ASSEMBLY permission.
    • ##MS_SSISServerCleanupJobLogin##
  • Creates two SQL Server Agent jobs on all replicas hosting the SSIS Catalog.
    • SSIS Failover Monitor Job – This job runs every 2 minutes and executes the master.dbo.sp_ssis_start procedure if it detects the host replica is now the primary.
    • SSIS Server Maintenance Job – Runs every day.  Job removes operation records from the SSISDB database that are outside the retention window and maintains a maximum number of versions per project.

The AG1 Availability Group is pictured below.

 

 

 

 

 

 

 

 

 

Now that we have an understanding of the process happening when we click “Enable Always On Support” let’s try creating a new availability group between SQLSRV03 and SQLSRV04.  We’ll attempt to add the SSISDB database and enable Always On support.  We’ll name this availability group AG3.

  1.  

 

We are prompted with a reminder “to enable Always On support under Integration Services Catalogs after adding SSISDB into Always On availability group.”  Click ok, next, and then finish.  AG3 has now been created.

Next step, enable Always On support for the Integration Services catalog.  This is done on the primary replica (currently SQLSRV03).

This results in the following error and we’re unable to proceed further.

Problem

One of the first steps the wizard does is to run the following query:

The query above results in the subquery error if the following is true:

  1. Multiple Availability Groups exist within the Windows Server Failover Cluster.  Two of which have a SSISDB database included (although each have separate primary and secondary replicas).
    1. The sys.availability_databases_cluster view contains one row for each availability database on the instance of SQL Server that is hosting an availability replica for any Always On availability group in the Windows Server Failover Clustering cluster, regardless of whether the local copy database has been joined to the availability group yet(http://bit.ly/2OswMHI).  The query will now return two rows resulting in the error: “Subquery returned more than 1 value.  This is not permitted when the subquery follows =, !=, <, <=, >, >= or when the subquery is used as an express. (Microsoft SQL Server, Error:512).”

By the definition of this query, you can only have one Availability Group, within a WSFC, hosting an Integrations Services Catalog.  This is unfortunate because there are scenarios in which you need high availability for multiple catalogs and different replicas.

To support this scenario the query would need to:

  • Join to the sys.availability_replicas view and filter on the replica_server_name column (ar.replica_server_name = @@servername).
  • It only needs to identify secondary replicas, within the availability groups the local server is a member of, that includes the SSISDB database (instead of cluster wide).

Workaround

We’ve identified the issue, how do we work around it?  The following steps can be taken to enable Always On support without using the built in wizard.

Primary Replica

The following scripts need to run on the primary replica.

  1. Add replica information and state to [internal].[alwayson_support_state].

    1.  
  2. Modify permissions for the ##MS_SSISServerCleanupJobUser## user.

    1.  
  3. Create the SSIS Failover Monitor job.

Secondary Replicas

Next, run the following scripts on all secondaries within the Availability Group.  SSISDB should have already been added to the Availability Group.  If any additional replicas are added at a later date, these would need to be applied to them as well.

  1. Enable CLR.
  2. Create the dbo.sp_ssis_startup stored procedure, within the master database.  Set this stored procedure to run whenever SQL Server starts.
  3. Create the MS_SQLEnableSystemAssemblyLoadingKey asymmetric key and the ##MS_SQLEnableSystemAssemblyLoadingUser## login.
  4. Create the ##MS_SSISServerCleanupJobLogin## login.  First, we’ll need to grab the SID of the login on the primary.  Second, we’ll create the login, using the SID from the first step, on all secondary replicas.
  5. Create the SSIS Server Maintenance job.
  6.  Create the SSIS Failover Monitor Job.

    1.  

At this point, review the two jobs created above and ensure they run successfully.  I also deploy a test SSIS project to each replica for verification (deploy project, failover, and redeploy again using the AG listener).

 

Thanks for reading!

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.