You are currently viewing Refactoring nightmare Entity Framework query with performance in mind – Part II

Refactoring nightmare Entity Framework query with performance in mind – Part II

I heard that first step to solve a problem is realizing you have one. It took me a while to get there. In the routine of everyday life there is always something more important, more urgent. Investing time in reworking things – especially things you’d like to leave behind you – is never an easy decision. But once you make it, you are already over the hump and things start to fall into place..

Taming the monster

The main problem with this query was that it was simply too complex (are you getting it? 😉) and it made it hard for the code maintainer to understand what’s going on there. As a result, making any change to the code was daunting experience.. It also made it hard for EF to optimize it and it was resulting in poor processing time.. It was clear that the query needed to be split into several more managable chunks whiches results would be merged later. I decided to separate out all the 1-to-many properties into individual queries and to run them all simultaneously with the main query. So first of all I removed all the subqueries from the main query so that it was fetching only Processes :

private async Task<List<Process>> FetchProcessesWithoutGroupings(int TenantId, DateTime dFrom, DateTime dTo, bool? givenTime = null, bool? finishRate = null, bool? handlingsLength = null)
        {
            Task<List<Process>> itemsQuery;
            try
            {

                itemsQuery = (from p in db.JDE_Processes
                              join uuu in db.JDE_Users on p.FinishedBy equals uuu.UserId into finished
                              from fin in finished.DefaultIfEmpty()
                              join t in db.JDE_Tenants on p.TenantId equals t.TenantId
                              join u in db.JDE_Users on p.CreatedBy equals u.UserId
                              join at in db.JDE_ActionTypes on p.ActionTypeId equals at.ActionTypeId
                              join uu in db.JDE_Users on p.StartedBy equals uu.UserId into started
                              from star in started.DefaultIfEmpty()
                              join lsu in db.JDE_Users on p.LastStatusBy equals lsu.UserId into lastStatus
                              from lStat in lastStatus.DefaultIfEmpty()
                              join pl in db.JDE_Places on p.PlaceId equals pl.PlaceId into places
                              from pla in places.DefaultIfEmpty()
                              join comp in db.JDE_Components on p.ComponentId equals comp.ComponentId into comps
                              from components in comps.DefaultIfEmpty()
                              join s in db.JDE_Sets on pla.SetId equals s.SetId into sets
                              from set in sets.DefaultIfEmpty()
                              join a in db.JDE_Areas on pla.AreaId equals a.AreaId into areas
                              from area in areas.DefaultIfEmpty()
                              where p.TenantId == TenantId && p.CreatedOn >= dFrom && p.CreatedOn <= dTo
                              orderby p.CreatedOn descending
                              select new Process
                              {
                                  ProcessId = p.ProcessId,
                                  Description = p.Description,
                                  StartedOn = p.StartedOn,
                                  StartedBy = p.StartedBy,
                                  StartedByName = star.Name + " " + star.Surname,
                                  FinishedOn = p.FinishedOn,
                                  FinishedBy = p.FinishedBy,
                                  FinishedByName = fin.Name + " " + fin.Surname,
                                  ActionTypeId = p.ActionTypeId,
                                  ActionTypeName = at.Name,
                                  IsActive = p.IsActive,
                                  IsFrozen = p.IsFrozen,
                                  IsCompleted = p.IsCompleted,
                                  IsSuccessfull = p.IsSuccessfull,
                                  PlaceId = p.PlaceId,
                                  PlaceName = pla.Name,
                                  PlaceImage = pla.Image,
                                  SetId = pla.SetId,
                                  SetName = set.Name,
                                  AreaId = pla.AreaId,
                                  AreaName = area.Name,
                                  Output = p.Output,
                                  TenantId = p.TenantId,
                                  TenantName = t.TenantName,
                                  CreatedOn = p.CreatedOn,
                                  CreatedBy = p.CreatedBy,
                                  CreatedByName = u.Name + " " + u.Surname,
                                  MesId = p.MesId,
                                  InitialDiagnosis = p.InitialDiagnosis,
                                  RepairActions = p.RepairActions,
                                  Reason = p.Reason,
                                  MesDate = p.MesDate,
                                  Comment = p.Comment,
                                  ComponentId = p.ComponentId,
                                  ComponentName = components.Name,
                                  PlannedStart = p.PlannedStart,
                                  PlannedFinish = p.PlannedFinish,
                                  LastStatus = (ProcessStatus)p.LastStatus,
                                  LastStatusBy = p.LastStatusBy,
                                  LastStatusByName = lStat.Name + " " + lStat.Surname,
                                  LastStatusOn = p.LastStatusOn,
                                  IsResurrected = p.IsResurrected,
                                  HasAttachments = db.JDE_FileAssigns.Any(f => f.ProcessId == p.ProcessId),
                              }).ToListAsync(); ;
            }
            catch (Exception ex)
            {

                throw;
            }
            return await itemsQuery;
        }

Then I created async fetching methods for Handlings, AssignedUsers, GivenTime and AbandonReasons.

private async Task<List<Process>> FetchHandlings()
        {
            using (Models.DbModel _db = new Models.DbModel())
            {
                var items = _db.JDE_Handlings.Where(x => x.ProcessId > 0 && x.ProcessId !=null).Select(h => new
                {
                    ProcessId = h.ProcessId,
                    HandlingId = h.HandlingId,
                    IsCompleted = h.IsCompleted,
                    StartedOn = h.StartedOn,
                    FinishedOn = h.FinishedOn
                }).GroupBy(pa => new { pa.ProcessId }).Select(pr => new Process
                {
                    ProcessId = (int)pr.Key.ProcessId,
                    OpenHandlings = pr.Count(item=>item.HandlingId > 0 && item.IsCompleted != true),
                    AllHandlings = pr.Count(item=>item.HandlingId > 0),
                    HandlingsLength = (from has in pr
                                       where has.ProcessId == pr.Key.ProcessId
                                       select has.FinishedOn == null ? System.Data.Entity.SqlServer.SqlFunctions.DateDiff("n", has.StartedOn, has.FinishedOn).Value : System.Data.Entity.SqlServer.SqlFunctions.DateDiff("n", has.StartedOn, has.FinishedOn).Value).Sum(),
                }).ToListAsync();

                return await items;
            }
        }
private async Task<List<ProcessAssign>> FetchProcessAssigns()
        {
            using (Models.DbModel _db = new Models.DbModel())
            {
                var items = (from pras in _db.JDE_ProcessAssigns
                              join uu in _db.JDE_Users on pras.UserId equals uu.UserId
                              select new ProcessAssign
                              {
                                  ProcessAssignId = pras.ProcessAssignId,
                                  ProcessId = pras.ProcessId,
                                  UserId = pras.UserId,
                                  UserName = uu.Name + " " + uu.Surname
                              }).ToListAsync();

                return await items;
            }
        }
private async Task<List<Process>> FetchGivenTimes()
        {
            using (Models.DbModel _db = new Models.DbModel())
            {
                var items = (from prac in _db.JDE_ProcessActions
                             join a in _db.JDE_Actions on prac.ActionId equals a.ActionId
                             where a.GivenTime != null 
                             select new Process
                             {
                                 ProcessId = (int)prac.ProcessId,
                                 GivenTime = a.GivenTime
                             }).ToListAsync();


                return await items;
            }
        }
private async Task<List<ProcessActionAbandonReason>> FetchAbandonReasons()
        {
            using (Models.DbModel _db = new Models.DbModel())
            {
                var items = (from pas in _db.JDE_ProcessActions
                             join ars in _db.JDE_AbandonReasons on pas.AbandonReasonId equals ars.AbandonReasonId into ar
                             from reasons in ar.DefaultIfEmpty()
                             where pas.AbandonReasonId != null
                             select new ProcessActionAbandonReason
                             {
                                 ProcessId = (int)pas.ProcessId,
                                 AbandonReasonId = pas.AbandonReasonId,
                                 AbandonReasonName = reasons.Name
                             }
                                 ).Distinct().ToListAsync();
                return await items;
            }
        }

Jigsaw falling into place

Ok, so we have few short, easy-to-read and easy-to-optimize subquerries that are fetching all the pieces our master query needs to compose the response. The idea is to execute them all together in the same time asynchronously and await their completion with Task.WhenAll:

var tasks = new List<Task>();
var processesTask = FetchProcessesWithoutGroupings(tenants.FirstOrDefault().TenantId, (DateTime)dFrom, (DateTime)dTo, GivenTime, FinishRate, handlingsLength);
tasks.Add(processesTask);

var handlingsTask = FetchHandlings();
tasks.Add(handlingsTask);

//Set up AbandonReasons fetching
Task<List<ProcessActionAbandonReason>> abandonsTask = GetAbandonReasons();
tasks.Add(abandonsTask);

//Set up AssignedUsers fetching
Task<List<ProcessAssign>> assignedUsersTask = GetProcessAssigns();
tasks.Add(assignedUsersTask);

//Set up GivenTimes fetching
Task<List<Process>> givenTimesTask = GetGivenTimes();
tasks.Add(givenTimesTask);

await Task.WhenAll(tasks);
var items = await processesTask;
var handlings = await handlingsTask;
var abandons = await abandonsTask;
var assigns = await assignedUsersTask;
var givenTimes = await givenTimesTask;

When all data is loaded to appropriate lists, all that left is glueing them back together:

var processes = from process in items
                join handling in handlings on process.ProcessId equals handling.ProcessId into Handlings
                from handling in Handlings.DefaultIfEmpty()
                join abandon in abandons on process.ProcessId equals abandon.ProcessId into ProcessAbandons
                join assign in assigns on process.ProcessId equals assign.ProcessId into ProcessAssigns
                join givenTime in givenTimes on process.ProcessId equals givenTime.ProcessId into GivenTimes
                select new Process
                {
                      ProcessId = process.ProcessId,
                      Description = process.Description,
                      StartedOn = process.StartedOn,
                      StartedBy = process.StartedBy,
                      StartedByName = process.StartedByName,
                      FinishedOn = process.FinishedOn,
                      FinishedBy = process.FinishedBy,
                      FinishedByName = process.FinishedByName,
                      ActionTypeId = process.ActionTypeId,
                      ActionTypeName = process.ActionTypeName,
                      IsActive = process.IsActive,
                      IsFrozen = process.IsFrozen,
                      IsCompleted = process.IsCompleted,
                      IsSuccessfull = process.IsSuccessfull,
                      PlaceId = process.PlaceId,
                      PlaceName = process.PlaceName ?? "",
                      PlaceImage = process.PlaceImage ?? "",
                      SetId = process.SetId,
                      SetName = process.SetName ?? "",
                      AreaId = process.AreaId,
                      AreaName = process.AreaName ?? "",
                      Output = process.Output ?? "",
                      TenantId = process.TenantId,
                      TenantName = process.TenantName,
                      CreatedOn = process.CreatedOn,
                      CreatedBy = process.CreatedBy,
                      CreatedByName = process.CreatedByName,
                      MesId = process.MesId,
                      InitialDiagnosis = process.InitialDiagnosis ?? "",
                      RepairActions = process.RepairActions ?? "",
                      Reason = process.Reason ?? "",
                      MesDate = process.MesDate,
                      Comment = process.Comment ?? "",
                      ComponentId = process.ComponentId,
                      ComponentName = process.ComponentName,
                      PlannedStart = process.PlannedStart,
                      PlannedFinish = process.PlannedFinish,
                      LastStatus = process.LastStatus,
                      LastStatusBy = process.LastStatusBy,
                      LastStatusByName = process.LastStatusByName,
                      LastStatusOn = process.LastStatusOn,
                      IsResurrected = process.IsResurrected,
                      OpenHandlings = handling == null ? 0 : handling.OpenHandlings,
                      AllHandlings = handling == null ? 0 : handling.AllHandlings,
                      AssignedUsers = ProcessAssigns.Select(x => x.UserName).AsQueryable(),
                      GivenTime = GivenTimes.Sum(x => x.GivenTime) == 0 ? null : GivenTimes.Sum(x => x.GivenTime),
                      HasAttachments = process.HasAttachments,
                      HandlingsLength = handling == null ? null : handling.HandlingsLength,
                      AbandonReasons = ProcessAbandons.Select(x => x.AbandonReasonName).AsQueryable(),
                      AbandonReasonNames = string.Join(",", ProcessAbandons.Select(x => x.AbandonReasonName))
                };

So, the first goal – making the query more readable and easier to maintain – has been accomplished. More important, though, is how it impacts processing time of our query. Is it at least equally fast? I’m going to do a small benchmark of both old and new queries and I’ll post my findings in 3rd part of this aricle – stay tuned! 😉

Dodaj komentarz