- 11 Aug, 2014 23 commits
-
-
Oscar Mateo authored
For the most part, logical ring context objects are similar to hardware contexts in that the backing object is meant to be opaque. There are some exceptions where we need to poke certain offsets of the object for initialization, updating the tail pointer or updating the PDPs. For our basic execlist implementation we'll only need our PPGTT PDs, and ringbuffer addresses in order to set up the context. With previous patches, we have both, so start prepping the context to be load. Before running a context for the first time you must populate some fields in the context object. These fields begin 1 PAGE + LRCA, ie. the first page (in 0 based counting) of the context image. These same fields will be read and written to as contexts are saved and restored once the system is up and running. Many of these fields are completely reused from previous global registers: ringbuffer head/tail/control, context control matches some previous MI_SET_CONTEXT flags, and page directories. There are other fields which we don't touch which we may want in the future. v2: CTX_LRI_HEADER_0 is MI_LOAD_REGISTER_IMM(14) for render and (11) for other engines. v3: Several rebases and general changes to the code. v4: Squash with "Extract LR context object populating" Also, Damien's review comments: - Set the Force Posted bit on the LRI header, as the BSpec suggest we do. - Prevent warning when compiling a 32-bits kernel without HIGHMEM64. - Add a clarifying comment to the context population code. v5: Damien's review comments: - The third MI_LOAD_REGISTER_IMM in the context does not set Force Posted. - Remove dead code. v6: Add a note about the (presumed) differences between BDW and CHV state contexts. Also, Brad's review comments: - Use the _MASKED_BIT_ENABLE, upper_32_bits and lower_32_bits macros. - Be less magical about how we set the ring size in the context. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1) Signed-off-by: Rafael Barbalho <rafael.barbalho@intel.com> (v2) Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
Any given ringbuffer is unequivocally tied to one context and one engine. By setting the appropriate pointers to them, the ringbuffer struct holds all the infromation you might need to submit a workload for processing, Execlists style. v2: Drop ring->ctx since that looks terribly ill-defined for legacy ringbuffer submission. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> (v1) Acked-by: Damien Lespiau <damien.lespiau@intel.com> (v2) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Oscar Mateo authored
As we have said a couple of times by now, logical ring contexts have their own ringbuffers: not only the backing pages, but the whole management struct. In a previous version of the series, this was achieved with two separate patches: drm/i915/bdw: Allocate ringbuffer backing objects for default global LRC drm/i915/bdw: Allocate ringbuffer for user-created LRCs Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Oscar Mateo authored
Now that we have the ability to allocate our own context backing objects and we have multiplexed one of them per engine inside the context structs, we can finally allocate and free them correctly. Regarding the context size, reading the register to calculate the sizes can work, I think, however the docs are very clear about the actual context sizes on GEN8, so just hardcode that and use it. v2: Rebased on top of the Full PPGTT series. It is important to notice that at this point we have one global default context per engine, all of them using the aliasing PPGTT (as opposed to the single global default context we have with legacy HW contexts). v3: - Go back to one single global default context, this time with multiple backing objects inside. - Use different context sizes for non-render engines, as suggested by Damien (still hardcoded, since the information about the context size registers in the BSpec is, well, *lacking*). - Render ctx size is 20 (or 19) pages, but not 21 (caught by Damien). - Move default context backing object creation to intel_init_ring (so that we don't waste memory in rings that might not get initialized). v4: - Reuse the HW legacy context init/fini. - Create a separate free function. - Rename the functions with an intel_ preffix. v5: Several rebases to account for the changes in the previous patches. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1) Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Oscar Mateo authored
A context backing object only makes sense for a given engine (because it holds state data specific to that engine). In legacy ringbuffer sumission mode, the only MI_SET_CONTEXT we really perform is for the render engine, so one backing object is all we nee. With Execlists, however, we need backing objects for every engine, as contexts become the only way to submit workloads to the GPU. To tackle this problem, we multiplex the context struct to contain <no-of-engines> objects. Originally, I colored this code by instantiating one new context for every engine I wanted to use, but this change suggested by Brad Volkin makes it more elegant. v2: Leave the old backing object pointer behind. Daniel Vetter suggested using a union, but it makes more sense to keep rcs_state as a NULL pointer behind, to make sure no one uses it incorrectly when Execlists are enabled, similar to what he suggested for ring->buffer (Rusty's API level 5). v3: Use the name "state" instead of the too-generic "obj", so that it mirrors the name choice for the legacy rcs_state. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Oscar Mateo authored
For the moment this is just a placeholder, but it shows one of the main differences between the good ol' HW contexts and the shiny new Logical Ring Contexts: LR contexts allocate and free their own backing objects. Another difference is that the allocation is deferred (as the create function name suggests), but that does not happen in this patch yet, because for the moment we are only dealing with the default context. Early in the series we had our own gen8_gem_context_init/fini functions, but the truth is they now look almost the same as the legacy hw context init/fini functions. We can always split them later if this ceases to be the case. Also, we do not fall back to legacy ringbuffers when logical ring context initialization fails (not very likely to happen and, even if it does, hw contexts would probably fail as well). v2: Daniel says "explain, do not showcase". Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> [danvet: s/BUG_ON/WARN_ON/.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
Depending upon one module option to be sanitized (through USES_PPGTT) for the other is a bit too fragile for my taste. At least WARN about this. Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Damien Lespiau <damien.lespiau@intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Oscar Mateo authored
GEN8 brings an expansion of the HW contexts: "Logical Ring Contexts". These expanded contexts enable a number of new abilities, especially "Execlists". The macro is defined to off until we have things in place to hope to work. v2: Rename "advanced contexts" to the more correct "logical ring contexts". v3: Add a module parameter to enable execlists. Execlist are relatively new, and so it'd be wise to be able to switch back to ring submission to debug subtle problems that will inevitably arise. v4: Add an intel_enable_execlists function. v5: Sanitize early, as suggested by Daniel. Remove lrc_enabled. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1) Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> (v3) Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> (v2, v4 & v5) Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Oscar Mateo authored
Some legacy HW context code assumptions don't make sense for this new submission method, so we will place this stuff in a separate file. Note for reviewers: I've carefully considered the best name for this file and this was my best option (other possibilities were intel_lr_context.c or intel_execlist.c). I am open to a certain bikeshedding on this matter, anyway. And some point in time, it would be a good idea to split intel_lrc.c/.h even further, but for the moment just shove everything together. v2: Change to intel_lrc.c v3: Squash together with the header file addition Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
Even though we should not try to use 4+GiB GTTs on 32-bit systems, by using a local variable we can future proof the code whilst making it easier to read. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Appease checkpatch a bit.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
Part of the pre-validation for an execbuffer call is that there is at least one object in the execlist. As we bail if we fail to lookup any object, we can be sure that after the eb_lookup_vma() there is at least one object in the vma list and so we do not need to assert. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
We have an implementation requirement that precludes the user from requesting a ggtt entry when the device is operating in ppgtt mode. Move the current check from inside the execbuffer object collation to the prevalidation phase. v2: Roll both invalid flags checks into one Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
Based upon a hunk from a patch from Chris Wilson, but augmented to: - Process the batch in the full ppgtt vm so that self-relocations match again with userspace's expectations.. - Add a comment why plain pin for the global gtt binding is safe at that point. v2: Drop local bind_vm variable (Chris). v3: Explain why this works despite the lack of proper active tracking for the ggtt batch vma. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
Adapt the macro so that we can pass either the struct drm_device or the struct drm_i915_private pointers and get the answer we want. Over time, my plan is to convert all users over to using drm_i915_private and so trimming down the pointer dance. Having spent a few hours chasing that goal and achieved over 8k of object code saving, it appears to be a worthwhile target. This interim macro allows us to slowly convert over. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Drop the (struct drm_device *) cast per the m-l discussion. Also explain the seemingly unecessary first cast.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
During ring initialisation, sometimes we observe, though not in production hardware, that the idle flag is not set even though the ring is empty. Double check before giving up. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Damien Lespiau authored
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Damien Lespiau authored
Found with sparse. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
This is so that we can make the drm_i915_private->info always the preferred source for chipset type and feature queries. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
This migrates the fence tracking onto the existing seqno infrastructure so that the later conversion to tracking via requests is simplified. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
Move the decision on whether we need to have a mappable object during execbuffer to the fore and then reuse that decision by propagating the flag through to reservation. As a corollary, before doing the actual relocation through the GTT, we can make sure that we do have a GTT mapping through which to operate. Note that the key to make this work is to ditch the obj->map_and_fenceable unbind optimization - with full ppgtt it doesn't make a lot of sense any more anyway. v2: Revamp and resend to ease future patches. v3: Refresh patch rationale References: https://bugs.freedesktop.org/show_bug.cgi?id=81094Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> [danvet: Explain why obj->map_and_fenceable is key and split out the secure batch fix.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
If an object is not bound into the global GTT, then it cannot be accessed via the GTT. This restores the original code that was muddled by ppGTT. In the process, we remove a WARN that had long outlived its usefulness and was simply being coded around instead. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Damien Lespiau authored
I keep telling myself that those tables aren't great because their size is the number of dwords we need to program and not the number of entries (number of dwords = number of entries * 2). And... I got it wrong when I refactored the code. Fortunately, it was only wrong when the VBT table (or the code parsing it) is itself erroneous. Long story short, it shouldn't matter, but still, there's a potential array overflow and random programming of the DDI translation tables. Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Sonika Jindal authored
Removing the check for HAS_PCH_SPLIT, it looks redundant here. Anyways all the platforms are checked separately. v2: Reordering as per the gen (Ville) Signed-off-by: Sonika Jindal <sonika.jindal@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 09 Aug, 2014 1 commit
-
-
Paulo Zanoni authored
Currently, if the machine is runtime suspended an you read the file, you will get an "Unclaimed register" error message. Testcase: igt/pm_rpm/debugfs-read Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 08 Aug, 2014 16 commits
-
-
Ville Syrjälä authored
Make the intel_{enable,disable}_primary_hw_plane() simply call .update_primary_plane(), thus eliminating the rmw from these functions which should help the poor old 830M. Now we can also remove the .update_primary_plane() from the .crtc_enable() hooks because we end up calling it via intel_crtc_enable_planes()->intel_enable_primary_hw_plane(). This also has the nice benefit of making primary planes a bit closer to the way we handle sprite planes during modesets. v2: Just write 0 to DSPCNTR and DSPSURF/DSPADDR if the plane is (to be) disabled. Quicker, and more importantly avoids an oops when fb==NULL due to BIOS fb takeover failure. Pimp the commit message a bit (Matt) v3: Drop useless primary_enabled checks when setting DISPLAY_PLANE_ENABLE Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
Move the entire DSPCNTR register setup into the .update_primary_plane() functions. That's where it belongs anyway and it'll also help 830M which has the extra problem that plane registers reads will return the value latched at the last vblank, not the value that was last written. Also move DSPPOS and DSPSIZE setup there. v2: Don't move variable initialization to avoid churn later Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Damien Lespiau authored
adj was defined as u8. The issue is last_adj can be negative and adj is initialized with: adj = dev_priv->rps.last_adj; and we were also happily doing things like: if (adj < 0) (thank static analysers!) v2: Make new_delay an int in case we overflow the u8 in the intermediate computations. new_delay will get clamped at the end anyway. (Ville) Cc: Deepak S <deepak.s@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
Doing a 1s wait (tops) with the cpu is a bit excessive. Tune it down like everything else in that code. v2: Also insert the missing space Chris spotted. Cc: Naresh Kumar Kachhi <naresh.kumar.kachhi@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Gajanan Bhat authored
Program DDL register as part of sprite watermark programming for CHV and VLV. v2: Rename DRAIN_LATENCY_MAX by DRAIN_LATENCY_MASK v3: Addressed review comments by Ville - Changed Sprite DDL definitions to more generic to avoid multiple if-else - Changed bit masking to customary form - Changed to bitwise shorthand operator for sprite_dl assignment Signed-off-by: Gajanan Bhat <gajanan.bhat@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Gajanan Bhat authored
Round up clock computation and limit drain latency to maximum of 0x7F. Signed-off-by: Gajanan Bhat <gajanan.bhat@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Gajanan Bhat authored
Modify drain latency computation to use it for any plane. Same function can be used for primary, cursor and sprite planes. v2: Adressed review comments by Imre and Ville. - Moved clock round up in separate patch - Added WARN check for clock and pixel size - Simplified bit masking - Use cursor_base instead of reg read v3: Changed to bitwise shorthand operator for plane_dl assignment. Signed-off-by: Gajanan Bhat <gajanan.bhat@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
If there are pending page flips when the fd gets closed those page flips may have events associated to them. When the page flip eventually completes it will queue the event to file_priv->event_list, but that may be too late and file_priv->event_list has already been cleaned up. Thus we leak a bit of kernel memory in the form of the event structure. To avoid such problems clear out such pending events from intel_crtc->unpin_work at ->preclose(). Any event that already made it to file_priv->event_list will get cleaned up by the drm_release_events() a bit later. We can ignore the file_priv->event_space accounting since file_priv is going away. This is already how drm core deals with pending vblank events, which are maintained by the drm core. What saves us from a total disaster (ie. dereferencing and alrady freed file_priv) is the fact that the fb descruction triggers a modeset and there we wait for pending flips. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Jesse Barnes authored
sanitize_enable_ppgtt is the function that checks all the conditions, honoring a forced ppgtt status or doing auto-detect as necessary. Just make sure it returns the right value in all cases and use that in the macros instead of the confusing intel_enable_ppgtt() function. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [danvet: Don't reenable full ppgtt through the backdoor.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
Replace the semi-funky cmnlane assert/deassert macros with something a bit more conventional. Also protect the macro arguments properly (also for PHY_POWERGOOD()). Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
It looks like frobbing the cmnreset line on pne PHY disturbs the other PHY on chv. The result is a black screen. On HDMI it's just a flash of black, but DP usually falls over and can't get back up. As a workaround set up the power domains so that both common lane wells power up and down together. I also tried leaving the cmnreset deasserted even the if the power well goes down but that didn't seem acceptable to the PHY. Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
CHV has a third pipe so we need to compute the watermarks for its planes. Add cherryview_update_wm() to do just that. v2: Rebase on top of Imre's cxsr changes v3: Pass crtc to vlv_update_drain_latency() Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Gajanan Bhat authored
Instead of looping through all CRTCs, update DDL for current CRTC for which watermark is being updated. CHV is confirmed to have precision of 32/64 which is same as VLV. Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Gajanan Bhat <gajanan.bhat@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
The VLV/CHV DDL registers are uniform, and neatly enough the register offsets are sane so we can easily unify them to a single set of defines and just pass the pipe as the parameter to compute the register offset. Note that we now fill out the drain latency for pipe C on CHV which we didn't do before. The rest of the pipe C watermarks are still untouched but that will be remedied later by adding a proper cherryview_update_wm() function. v2: Add a note about CHV pipe C changes (Paulo) Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
Add defines for all the watermark registers on modernish gmch platforms. VLV has increased the number of bits available for certain watermaks so expand the masks appropriately. Also vlv and chv have added some extra FW registers. Not sure what happened on chv because a new register called FW9 is now at the offset where FW7 was on vlv, while FW7 and FW8 (another new register) have been moved off somewhere else. Oh well, well just need two defines for FW7 then. v2: Fix DSPHOWM1 offset (Paulo) Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-