Implementing and using dependency injection in .NET Core: 5. Using an HtmlEncoder that supports Unicode

Symptom

In ASP.NET Core MVC, when a string containing Chinese characters is passed to a page, the page displays normally. However, if you view the page source, the Chinese characters are not there; they have been replaced by a string of encoded character references.
For example, define a string containing Chinese text directly in the page and then display it.

@{
    ViewData["Title"] = "Home Page";
    string world = "世界";
}

<div class="text-center">
    <h1 class="display-4">Welcome</h1>
    <p>Hello @world.</p>
</div>

After running the application, the page displays normally.

But when you view the page source, the Chinese text is missing:

<p>Hello &#x4E16;&#x754C;.</p>

The reason is that the string content is HTML-encoded before it is written to the output.

Analysis

In ASP.NET Core, to guard against XSS attacks, every character outside the Basic Latin range (U+0000..U+007F) is encoded by default. In practice, almost everything other than English letters, digits, and common punctuation is encoded.

This behavior comes from the allowed UnicodeRange of the default HtmlEncoder being set to UnicodeRanges.BasicLatin.
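
The default behavior can be reproduced outside of MVC. The following is a minimal sketch, assuming a .NET console application written with top-level statements; the variable names are illustrative only. It encodes the same string with the default encoder and with an encoder created for UnicodeRanges.All.

```csharp
using System;
using System.Text.Encodings.Web;
using System.Text.Unicode;

const string text = "世界";

// The default encoder only allows Basic Latin, so CJK characters
// are written as numeric character references.
string encodedByDefault = HtmlEncoder.Default.Encode(text);

// An encoder created with a wider allow-list leaves the same characters as they are.
HtmlEncoder wideEncoder = HtmlEncoder.Create(UnicodeRanges.All);
string encodedByWide = wideEncoder.Encode(text);

// HTML-sensitive characters are still escaped, even with the wider range.
string escapedTag = wideEncoder.Encode("<b>");

Console.WriteLine(encodedByDefault); // &#x4E16;&#x754C;
Console.WriteLine(encodedByWide);    // 世界
Console.WriteLine(escapedTag);       // &lt;b&gt;
```

Widening the range changes which characters are written literally. Characters such as < and > are still escaped.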

Encoding is performed by the HtmlEncoder class, which lives in the System.Text.Encodings.Web project on GitHub. By default, DefaultHtmlEncoder is used, and it only lets Basic Latin characters through unencoded. Here is a snippet of the source code:

internal sealed class DefaultHtmlEncoder : HtmlEncoder
    {
        private readonly AllowedCharactersBitmap _allowedCharacters;
        internal static readonly DefaultHtmlEncoder Singleton = new DefaultHtmlEncoder(new TextEncoderSettings(UnicodeRanges.BasicLatin));

as well as:

public abstract class HtmlEncoder : TextEncoder
    {
        /// <summary>
        /// Returns a default built-in instance of <see cref="HtmlEncoder"/>.
        /// </summary>
        public static HtmlEncoder Default
        {
            get { return DefaultHtmlEncoder.Singleton; }
        }


Solution

The fix is to widen the allowed UnicodeRange. In other words, we replace the default encoder with one that lets Chinese (and other Unicode) characters through.

In ASP.NET Core, services are provided through dependency injection. Therefore, we can solve this problem by adjusting the services registered in the container.

In the MVC source code, DefaultHtmlGenerator receives an HtmlEncoder through constructor injection and uses it to handle encoding.

public DefaultHtmlGenerator(
      IAntiforgery antiforgery,
      IOptions<MvcViewOptions> optionsAccessor,
      IModelMetadataProvider metadataProvider,
      IUrlHelperFactory urlHelperFactory,
      HtmlEncoder htmlEncoder,
      ClientValidatorCache clientValidatorCache)
{
}

See the source code: https://github.com/dotnet/aspnetcore/blob/master/src/Mvc/Mvc.ViewFeatures/src/DefaultHtmlGenerator.cs

ASP.NET Core itself registers the HtmlEncoder service and lets you configure it through WebEncoderOptions.
The code looks like this:

public static IServiceCollection AddWebEncoders(this IServiceCollection services)
        {
            if (services == null)
            {
                throw new ArgumentNullException(nameof(services));
            }

            services.AddOptions();

            // Register the default encoders
            // We want to call the 'Default' property getters lazily since they perform static caching
            services.TryAddSingleton(
                CreateFactory(() => HtmlEncoder.Default, settings => HtmlEncoder.Create(settings)));
            services.TryAddSingleton(
                CreateFactory(() => JavaScriptEncoder.Default, settings => JavaScriptEncoder.Create(settings)));
            services.TryAddSingleton(
                CreateFactory(() => UrlEncoder.Default, settings => UrlEncoder.Create(settings)));

            return services;
        }
...
        private static Func<IServiceProvider, TService> CreateFactory<TService>(
            Func<TService> defaultFactory,
            Func<TextEncoderSettings, TService> customSettingsFactory)
        {
            return serviceProvider =>
            {
                var settings = serviceProvider
                    ?.GetService<IOptions<WebEncoderOptions>>()
                    ?.Value
                    ?.TextEncoderSettings;
                return (settings != null) ? customSettingsFactory(settings) : defaultFactory();
            };
        }

A factory is used here to create three encoders: HtmlEncoder, JavaScriptEncoder, and UrlEncoder. While creating them, the factory tries to find a WebEncoderOptions configuration object. If its TextEncoderSettings property is set, the settings are passed to Create() to build the encoder; otherwise, the default instance is used.

In addition, an overload of the extension method is provided that registers a WebEncoderOptions configuration, which brings in support for the Options pattern. It lets you configure the encoding range that is used.

public static IServiceCollection AddWebEncoders(this IServiceCollection services, Action<WebEncoderOptions> setupAction)
        {
            if (services == null)
            {
                throw new ArgumentNullException(nameof(services));
            }

            if (setupAction == null)
            {
                throw new ArgumentNullException(nameof(setupAction));
            }

            services.AddWebEncoders();
            services.Configure(setupAction);

            return services;
        }

See the source code: https://github.com/dotnet/aspnetcore/blob/master/src/WebEncoders/src/EncoderServiceCollectionExtensions.cs
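
The two branches of the factory can be observed with a small container built by hand. This is only a sketch, assuming a console application with top-level statements that references the Microsoft.Extensions.WebEncoders and Microsoft.Extensions.DependencyInjection packages; the variable names are illustrative.

```csharp
using System;
using System.Text.Encodings.Web;
using System.Text.Unicode;
using Microsoft.Extensions.DependencyInjection;

// Case 1: no TextEncoderSettings configured, so the factory falls back to the default encoder.
var plainServices = new ServiceCollection();
plainServices.AddWebEncoders();
using var plainProvider = plainServices.BuildServiceProvider();
string plainResult = plainProvider.GetRequiredService<HtmlEncoder>().Encode("世界");

// Case 2: TextEncoderSettings is supplied, so the factory creates the encoder from those settings.
var wideServices = new ServiceCollection();
wideServices.AddWebEncoders(options =>
    options.TextEncoderSettings = new TextEncoderSettings(UnicodeRanges.All));
using var wideProvider = wideServices.BuildServiceProvider();
string wideResult = wideProvider.GetRequiredService<HtmlEncoder>().Encode("世界");

Console.WriteLine(plainResult); // &#x4E16;&#x754C;
Console.WriteLine(wideResult);  // 世界
```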

Therefore, there are two directions to choose from. One is to work directly with HtmlEncoder and provide a new HtmlEncoder instance. The better one is to use the Options pattern to supply the appropriate settings when the encoder is created.

Here are five ways to do it.

The first way is to register another service of the same type. When multiple services of the same type are registered, the last one registered is used when a single instance is injected. So registering another HtmlEncoder service solves the problem.
The code is as follows:

services.AddSingleton(HtmlEncoder.Create(UnicodeRanges.All));

The drawback of this approach is that it leaves one more HtmlEncoder registration, and therefore one more object, in the container.

The second way is to replace the original HtmlEncoder service outright. This can be done with the Replace() extension method on IServiceCollection, which takes a ServiceDescriptor describing the replacement. This is more thorough.

var descriptor =
    new ServiceDescriptor(
        typeof(HtmlEncoder),
        HtmlEncoder.Create(UnicodeRanges.All));
services.Replace(descriptor);

Now only the single, widened HtmlEncoder is registered, with no redundant HtmlEncoder objects.
Note that HtmlEncoder is in the System.Text.Encodings.Web namespace, UnicodeRanges is in System.Text.Unicode, and the Replace() extension method is in Microsoft.Extensions.DependencyInjection.Extensions. Remember to import all of them.

using System.Text.Encodings.Web;
using System.Text.Unicode;
using Microsoft.Extensions.DependencyInjection.Extensions;
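
The difference between the first two ways shows up in the number of HtmlEncoder registrations left in the collection. The following is a rough sketch, assuming a console application with top-level statements that references the Microsoft.Extensions.WebEncoders and Microsoft.Extensions.DependencyInjection packages; the variable names are illustrative.

```csharp
using System;
using System.Linq;
using System.Text.Encodings.Web;
using System.Text.Unicode;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.DependencyInjection.Extensions;

// First way: add a second registration on top of the framework's own.
var added = new ServiceCollection();
added.AddWebEncoders();
added.AddSingleton(HtmlEncoder.Create(UnicodeRanges.All));
int addedCount = added.Count(d => d.ServiceType == typeof(HtmlEncoder));

// Second way: swap the framework's registration for the new one.
var replaced = new ServiceCollection();
replaced.AddWebEncoders();
replaced.Replace(
    new ServiceDescriptor(typeof(HtmlEncoder), HtmlEncoder.Create(UnicodeRanges.All)));
int replacedCount = replaced.Count(d => d.ServiceType == typeof(HtmlEncoder));

Console.WriteLine(addedCount);    // 2
Console.WriteLine(replacedCount); // 1
```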

The third way relies on the Options pattern: since the encoders can be configured through options, we can set those options ourselves. This is done with the Configure() method, which configures the WebEncoderOptions object through an Action delegate. With this approach, the setting applies to all three encoders. The method is defined as follows:

public static IServiceCollection Configure<TOptions>(
      this IServiceCollection services,
      Action<TOptions> configureOptions) where TOptions : class;

The implementation is as follows:

services.Configure<Microsoft.Extensions.WebEncoders.WebEncoderOptions>(
    options =>
        options.TextEncoderSettings = new TextEncoderSettings(UnicodeRanges.All));

With this in place, the configured settings are used when HtmlEncoder and the other two encoders are created.

The fourth way is also based on the Options pattern, but uses PostConfigure(), which runs after the Configure() callbacks. See:

services.PostConfigure<Microsoft.Extensions.WebEncoders.WebEncoderOptions>(
    options =>
        options.TextEncoderSettings = new TextEncoderSettings(UnicodeRanges.All));

The effect is the same as the third way. However, PostConfigure() is guaranteed to run after every Configure() callback, so a later Configure() call cannot override this setting. That makes it more robust than the third way.
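
The ordering guarantee can be checked by deliberately registering the callbacks in the "wrong" order. This is a sketch under the same assumptions as before: a console application with top-level statements that references the Microsoft.Extensions.WebEncoders and Microsoft.Extensions.DependencyInjection packages. The conflicting Configure() call is there only for demonstration.

```csharp
using System;
using System.Text.Encodings.Web;
using System.Text.Unicode;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.WebEncoders;

var services = new ServiceCollection();
services.AddWebEncoders();

// Registered first, but executed last because it is a post-configuration step.
services.PostConfigure<WebEncoderOptions>(options =>
    options.TextEncoderSettings = new TextEncoderSettings(UnicodeRanges.All));

// Registered later, but executed before any PostConfigure() callback.
services.Configure<WebEncoderOptions>(options =>
    options.TextEncoderSettings = new TextEncoderSettings(UnicodeRanges.BasicLatin));

using var provider = services.BuildServiceProvider();
string result = provider.GetRequiredService<HtmlEncoder>().Encode("世界");

Console.WriteLine(result); // 世界
```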

The fifth and best way, as the source code above shows, is to use the AddWebEncoders extension method that the framework already provides. Under the hood it does the same thing as the third way.

services.AddWebEncoders(options => options.TextEncoderSettings = new TextEncoderSettings(UnicodeRanges.All));

This way is preferable because it expresses the intent more clearly.
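
UnicodeRanges.All is the broadest possible setting. If only Chinese text needs to appear unencoded, a narrower allow-list may be enough. The following sketch is a variation on the fifth way and does not come from the original walkthrough. It assumes a console application with top-level statements that references the Microsoft.Extensions.WebEncoders and Microsoft.Extensions.DependencyInjection packages, and it assumes that Basic Latin plus CJK Unified Ideographs covers the text being displayed.

```csharp
using System;
using System.Text.Encodings.Web;
using System.Text.Unicode;
using Microsoft.Extensions.DependencyInjection;

var services = new ServiceCollection();

// Allow only Basic Latin and CJK Unified Ideographs instead of every Unicode range.
services.AddWebEncoders(options =>
    options.TextEncoderSettings = new TextEncoderSettings(
        UnicodeRanges.BasicLatin,
        UnicodeRanges.CjkUnifiedIdeographs));

using var provider = services.BuildServiceProvider();
HtmlEncoder encoder = provider.GetRequiredService<HtmlEncoder>();

string chinese = encoder.Encode("Hello 世界");
string accented = encoder.Encode("é"); // outside both allowed ranges

Console.WriteLine(chinese);  // Hello 世界
Console.WriteLine(accented); // &#xE9;
```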